ABSTRACT

We report the distribution of planets as a function of planet radius, orbital period, and stellar
effective temperature for orbital periods less than 50 days around Solar-type (GK) stars. These results
are based on the 1,235 planets (formally "planet candidates") from the Kepler mission that include a
nearly complete set of detected planets as small as 2 R⊕. For each of the 156,000 target stars we assess
the detectability of planets as a function of planet radius, R_p, and orbital period, P, using a measure
of the detection efficiency for each star. We also correct for the geometric probability of transit, R★/a.
We consider first Kepler target stars within the "solar subset" having T_eff = 4100-6100 K, log g =
4.0-4.9, and Kepler magnitude Kp < 15 mag, i.e. bright, main sequence GK stars. We include only
those stars having photometric noise low enough to permit detection of planets down to 2 R⊕. We
count planets in small domains of R_p and P and divide by the included target stars to calculate planet
occurrence in each domain. The resulting occurrence of planets varies by more than three orders of
magnitude in the radius-orbital period plane and increases substantially down to the smallest radius
(2 R⊕) and out to the longest orbital period (50 days, ~0.25 AU) in our study. For P < 50 days,
the distribution of planet radii is given by a power law, df/dlogR = k_R R^α with k_R = 2.9^{+0.5}_{-0.4},
α = -1.92 ± 0.11, and R = R_p/R⊕. This rapid increase in planet occurrence with decreasing planet
size agrees with the prediction of core-accretion formation, but disagrees with population synthesis
models that predict a desert at super-Earth and Neptune sizes for close-in orbits. Planets with orbital
periods shorter than 2 days are extremely rare; for R_p > 2 R⊕ we measure an occurrence of less than
0.001 planets per star. For all planets with orbital periods less than 50 days, we measure occurrence
of 0.130 ± 0.008, 0.023 ± 0.003, and 0.013 ± 0.002 planets per star for planets with radii 2-4, 4-8,
and 8-32 R⊕, in agreement with Doppler surveys. We fit occurrence as a function of P to a power
law model with an exponential cutoff below a critical period P_0. For smaller planets, P_0 has larger
values, suggesting that the "parking distance" for migrating planets moves outward with decreasing
planet size. We also measured planet occurrence over a broader stellar T_eff range of 3600-7100 K,
spanning M0 to F2 dwarfs. Over this range, the occurrence of 2-4 R⊕ planets in the Kepler field
linearly increases with decreasing T_eff, making these small planets seven times more abundant around
cool stars (3600-4100 K) than the hottest stars in our sample (6600-7100 K).

Subject headings: planetary systems, stars: statistics — techniques: photometry 
1. INTRODUCTION

The dominant theory for the formation of planets within 20 AU involves the collisions and
sticking of planetesimals having a rock and ice composition, growing to Earth-size and beyond.
The presence of gas in the protoplanetary disk allows gravitational accretion of hydrogen,
helium and other volatiles, with accretion rates depending on gas density and temperature,
and hence on location within the disk and its stage of evolution. The relevant processes,
including inward migration, have been simulated numerically both for individual planet growth
and for entire populations of planets (Ida & Lin 2004, 2008b; Mordasini et al. 2009a;
Schlaufman et al. 2010; Ida & Lin 2010; Alibert et al. 2011).
The simulations suggest that most planets form near or beyond the ice line. When they reach
a critical mass of several Earth-masses (M⊕) the planets either rapidly spiral inward to the
host star because of the onset of Type II migration or undergo runaway gas accretion and
become massive gas-giants, thus producing a "planet desert" (Ida & Lin 2008a). The predicted
desert resides in the mass range ~1-20 M⊕ orbiting inside of ~1 AU, with details that vary
with the assumed behavior of inward planet migration (Ida & Lin 2008a, 2010; Alibert et al.
2011; Schlaufman et al. 2009). Another prediction is that the distribution of planets in the
mass/orbital distance plane is fairly uniform for masses above the planet desert (>20 M⊕)
and inside of ~0.25 AU (periods less than 50 days). The majority of the planets in these
models reside near or beyond the ice line at ~2 AU (well outside of the P < 50 days domains
analyzed here). The mass distribution for these distant planets rises toward super-Earth
and Earth-mass (Ida & Lin 2008b; Mordasini et al. 2009b; Alibert et al. 2011). These patterns
of planet occurrence in the two-parameter space defined by planet masses and orbital periods
can be directly tested with observations of a statistically large sample of planets orbiting
within 1 AU of their host stars.
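
The correspondence between a 50 day period and an orbital distance of roughly 0.25 AU follows from Kepler's third law. The sketch below is illustrative only: it assumes a host star of one solar mass and a planet of negligible mass, and the function name is hypothetical.

```python
def semi_major_axis_au(period_days, stellar_mass_msun=1.0):
    """Kepler's third law in solar units: a^3 = M * P^2 (AU, years, solar masses).

    Assumes the planet mass is negligible compared with the star.
    """
    period_years = period_days / 365.25
    return (stellar_mass_msun * period_years ** 2) ** (1.0 / 3.0)


# Orbital distance at the longest period considered in this study.
a_50 = semi_major_axis_au(50.0)
print(f"P = 50 d around a 1 Msun star -> a = {a_50:.3f} AU")
```

For lower-mass K dwarfs the same period corresponds to a slightly smaller orbit, since a scales as the cube root of the stellar mass.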

Two early observational tests of the planet-formation simulations have emerged. Using
RV-detected planets, Howard et al. (2010) measured planet occurrence for close-in planets
(P < 50 days) with masses that span nearly three orders of magnitude, from super-Earths to
Jupiters (M_p sin i = 3-1000 M⊕). This Eta-Earth Survey focused on 166 G and K dwarfs on
the main sequence. The survey showed an increasing occurrence, f, of planets with decreasing
mass, M, from 1000 to 3 M⊕. A power law fit to the observed distribution of planet mass
gave df/dlogM = 0.39 M^{-0.48}. Remarkably, the survey revealed a high occurrence of planets
in the period range P = 10-50 days and mass range M_p sin i = 4-10 M⊕, precisely within the
predicted planet desert. Planets with M_p sin i = 10-100 M⊕ and P < 20 days were found to
be quite rare. Thus, the predicted desert was found to be full of planets and the predicted
uniform mass distribution for close-in planets above the desert was found to be rising with
smaller mass, not flat. These discrepancies suggest that current population synthesis models
of planet formation are failing to explain the distribution of low-mass planets around
solar-type stars.

Accounting for completeness, Howard et al. (2010) found a planet occurrence of
15^{+5}_{-4}% for planets with M_p sin i = 3-30 M⊕ and P < 50 d around main sequence
G and K stars. In contrast, Mayor et al. have asserted a substantially higher planet
occurrence of 30% ± 10% (Mayor et al. 2009) or higher, with a careful statistical
study still in progress. Thus, there may be observational discrepancies in planet
occurrence which we expect to be resolved soon. Still, there is qualitative agreement
between Howard et al. (2010) and Mayor et al. (2009) that the predicted paucity of
planets of mass ~1-30 M⊕ within 1 AU is not observed, as that close-in domain is, in
fact, rich with small planets. The planet candidates from Kepler, along with a careful
assessment of both false positive rates and completeness, can add a key independent
measure of the occurrence of small planets to compare with the Eta-Earth Survey and
Mayor et al. Formally these objects are "planet candidates" as a small percentage will
turn out to be false positive detections; we often refer to them as "planets" below.
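
As a rough cross-check, the power law df/dlogM = 0.39 M^{-0.48} quoted above can be integrated over M_p sin i = 3-30 M⊕ and compared with the completeness-corrected occurrence of about 15% for the same mass range. The sketch below assumes the logarithm is base 10 and that the fit applies uniformly across the interval; neither assumption is stated explicitly in the text, and the function name is hypothetical.

```python
import math


def integrated_occurrence(k, alpha, x_min, x_max):
    """Integrate df/dlog10(x) = k * x**alpha over log10(x) from x_min to x_max.

    With u = log10(x), x**alpha = 10**(alpha * u), so the integral equals
    k / (alpha * ln 10) * (x_max**alpha - x_min**alpha).
    """
    return k / (alpha * math.log(10.0)) * (x_max ** alpha - x_min ** alpha)


# Mass power law quoted in the text (assumed base-10 logarithm).
f_3_30 = integrated_occurrence(k=0.39, alpha=-0.48, x_min=3.0, x_max=30.0)
print(f"Planets per star with M sin i = 3-30 Earth masses: {f_3_30:.3f}")
```

Under these assumptions the integral comes out close to 0.14 planets per star, within the quoted uncertainty of the 15% figure.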

The observed occurrence of small planets orbiting close-in connects smoothly with the
similar analysis by Cumming et al. (2008), who measured 10.5% of Solar-type stars hosting
a gas-giant planet (M_p sin i = 100-3000 M⊕, P = 2-2000 days), for which planet occurrence
varies as df ∝ M^{-0.31±0.2} P^{0.26±0.1} dlogM dlogP. Thus, the occurrence of giant planets
orbiting in 0.5-3 AU seems to attach smoothly to the occurrence of planets down to 3 M⊕
orbiting within 0.25 AU. This suggests that the formation and accretion processes are
continuous in that domain of planet mass and orbital distance, or that the admixture of
relevant processes varies continuously from 1000 M⊕ down to 3 M⊕.

Planet formation theory must also account for remarkable orbital properties of exoplanets.
The orbital eccentricities span the range e = 0-0.93 and the close-in "hot Jupiters" show
a wide distribution of alignments (or misalignments) with the equatorial plane of the host
star (e.g., Johnson et al. 2009; Winn et al. 2010, 2011; Triaud et al. 2010; Morton &
Johnson 2010). Thus, standard planet formation theory probably requires additional
planet-planet gravitational interactions to explain these non-circular and non-coplanar
orbits (e.g., Chatterjee et al. 2010; Wu & Lithwick 2011; Nagasawa et al. 2008).

The distribution of planets in the mass/orbital-period plane reveals important clues about
planet formation and migration. Here we carry out an analysis of the epochal Kepler results
for transiting planet candidates from Borucki et al. (2011) with a careful treatment of the
completeness. We focus attention on the planets with orbital periods less than 50 days to
match the period range to which RV surveys are most sensitive. The goals are to measure the
occurrence distribution of close-in planets, to independently test planet population
synthesis models, and to check the Doppler RV results of Howard et al. (2010). While none
of the planets or stars are in common between Kepler and RV surveys, we will combine the
mass distribution (from RV) and the radius distribution (from Kepler) to constrain the bulk
densities of the types of planets they have in common. Planet formation models predict
great diversity in the interior structures of planets having Earth-mass to Saturn-mass,
caused by the various admixtures of rock, water-ice, and H & He gas. Here we attempt to
statistically assess planet radii and masses to arrive for the first time at the density
distribution of planets within 0.25 AU of their host stars.

2. SELECTION OF KEPLER TARGET STARS AND 
PLANET CANDIDATES 

We seek to determine the occurrence of planets as a function of orbital period, planet
radius (from Kepler) and planet mass (from Doppler searches). Measuring occurrence using
either Doppler or transit techniques suffers from detection efficiency that is a function
of both the properties of the planet (radius, orbital period) and of the individual stars
(notably noise from stellar activity). Thus the effective stellar sample from which
occurrence may be measured is itself a function of planet properties and the quality of
the data for each target star. A key element of this paper is that only a subset of the
target stars are amenable to the detection of planets having a certain radius and period.

To overcome this challenge posed by planet detection completeness, we construct a
two-dimensional space of orbital period and planet radius (or mass). We divide this space
into small domains of specified increments in period and planet radius (or mass) and
carefully determine the subset of target stars for which the detection of planets in that
small domain has high efficiency. In that way, each domain of orbital period and planet
size (or mass) has its own subsample of target stars that are selected a priori, within
which the detected planets can be counted and compared to that number of stars. This
treatment of detection completeness for each target star was successfully adopted by
Howard et al. (2010) in the assessment of planet occurrence as a function of orbital
period and planet mass (M_p sin i) from Doppler surveys. Here we carry out a similar
analysis of occurrence of planets from the Kepler survey in a two-dimensional space of
orbital period and planet radius.
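
The per-domain counting described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in this paper: the `Planet` class, the function name, and the sample numbers are hypothetical, and the weighting of each detected planet by a/R★ is an assumed way of applying the geometric transit probability, R★/a, mentioned in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Planet:
    radius_earth: float   # planet radius in Earth radii
    period_days: float    # orbital period in days
    a_over_rstar: float   # semi-major axis divided by stellar radius


def cell_occurrence(planets, n_stars_searchable, r_range, p_range):
    """Planets per star in one (radius, period) domain.

    n_stars_searchable is the number of target stars around which a planet
    in this domain would have been detectable (selected a priori).
    """
    r_lo, r_hi = r_range
    p_lo, p_hi = p_range
    in_cell = [pl for pl in planets
               if r_lo <= pl.radius_earth < r_hi and p_lo <= pl.period_days < p_hi]
    # Assumption: a transiting planet is seen with probability R*/a, so each
    # detection stands for a/R* planets over all orbital inclinations.
    augmented_count = sum(pl.a_over_rstar for pl in in_cell)
    return augmented_count / n_stars_searchable


# Hypothetical numbers, for illustration only.
sample = [Planet(2.5, 12.0, 20.0), Planet(3.0, 15.0, 30.0), Planet(10.0, 3.0, 8.0)]
occ = cell_occurrence(sample, n_stars_searchable=50000,
                      r_range=(2.0, 4.0), p_range=(10.0, 20.0))
print(f"Occurrence in cell: {occ:.4f} planets per star")
```

Because the denominator is specific to each domain, cells for small planets at long periods draw on fewer stars than cells for large planets at short periods.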

2.1. Winnowing the Kepler Target Stars for High 
Planet Detectability 

To measure planet occurrence we compare the number 
of detected planets having some set of properties (radii, 
orbital periods, etc.) to the set of stars from which plan- 
ets with those properties could have been reliably de- 
tected. Errors in either the number of planets detected 
or the number of stars surveyed corrupt the planet occur- 
rence measurement. We adopt the philosophy that it is 
preferable to suffer higher Poisson errors from consider- 
ing fewer planets and stars than the difficult-to-quantify 
systematic errors caused by studying a larger number of 
planets and stars with more poorly determined detection 
completeness. 

We begin our winnowing of target stars with the Kepler Input Catalog (KIC; Brown et al.
2011; Kepler Mission Team 2009). In this paper we include only planet candidates found in
three data segments ("Quarters") labeled Q0, Q1, and Q2 for which all photometry is
published (Borucki et al. 2011). Q0 was commissioning data (2-11 May 2009), Q1 includes
data from 13 May to 15 June 2009, and Q2 includes data from 15 June to 17 September 2009.
The segments had durations of 9.7, 33.5, and 93 days, respectively. Kepler achieved a duty
cycle of greater than 90%, which almost completely eliminated window function effects
(von Braun et al. 2009). A total of 156,097 long cadence targets (30 min integrations)
were observed in Q1 and 166,247 targets were observed in Q2, with the targets in Q2 being
nearly a superset of those in Q1. In this paper we consider only the "exoplanet target
stars", of which 153,196 were observed during Q2; these are used for the statistics
presented here (Batalha et al. 2010). (The remaining Kepler targets in Q2 were evolved
stars, not suitable for sensitive planet detection.) The few percent changes in the
planet-search target stars are not significant here as Q2 data dominate the planet
detectability. The KIC contains stellar T_eff and radii (R★) that are based on four
visible-light magnitudes (g, r, i, z) and a fifth, D51, calibrated with model atmospheres,
and JHK IR magnitudes (Brown et al. 2011).

The photometric calibrations yield T_eff reliable to ±135 K (rms) and surface gravity
log g reliable to ±0.25 dex (rms), based on a comparison of KIC values to results of high
resolution spectra obtained with the Keck I telescope and LTE analysis (Brown et al. 2011).
Stellar radii are estimated from T_eff and log g and carry an uncertainty of 0.13 dex,
i.e. 35% rms (Brown et al. 2011). There is a concern that values of log g for subgiants
are systematically overestimated, leading to stellar radii that are smaller than their
true radii, perhaps by as much as a factor of two. One should be concerned that a
magnitude-limited survey such as Kepler may favor slightly evolved stars, implying
systematic underestimates of stellar radii, an effect worth considering at the
interpretation stage of this work. The quoted planet radii may be too small by as much as
a factor of two for evolved stars. We adopt the KIC values for stellar T_eff and R★ and
their associated uncertainties, following Borucki et al. (2011). The stellar metallicities
are poorly known. The KIC is available on the Multi-Mission Archive at the Space Telescope
Science Institute (MAST) website.
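
Two of the numbers above follow from short arithmetic: a 0.13 dex uncertainty corresponds to about 35%, and an inferred planet radius scales linearly with the adopted stellar radius because the transit depth fixes only the ratio R_p/R★. The sketch below illustrates both; the function names are hypothetical and the Sun-to-Earth radius ratio of about 109 is an approximate constant supplied here, not a value from the text.

```python
REARTH_PER_RSUN = 109.2  # approximate ratio of solar to Earth radius (assumed)


def dex_to_fractional(dex):
    """Fractional (upward) uncertainty corresponding to an uncertainty in dex."""
    return 10.0 ** dex - 1.0


def planet_radius_rearth(depth, rstar_rsun):
    """Planet radius in Earth radii from a central transit depth (R_p/R_star)^2."""
    return rstar_rsun * REARTH_PER_RSUN * depth ** 0.5


frac = dex_to_fractional(0.13)
print(f"0.13 dex -> {100 * frac:.0f}% uncertainty in stellar radius")

# If the true stellar radius is twice the catalog value, the planet radius
# inferred from the same transit depth doubles as well.
depth = 3.0e-4
r_catalog = planet_radius_rearth(depth, rstar_rsun=1.0)
r_evolved = planet_radius_rearth(depth, rstar_rsun=2.0)
print(f"R_p = {r_catalog:.2f} R_Earth (catalog) vs {r_evolved:.2f} R_Earth (evolved)")
```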

In this paper, we primarily consider Kepler target stars having properties in the core of
the Kepler mission, namely bright solar-type main sequence stars. Specifically, we consider
only Kepler target stars within this domain of the H-R diagram: T_eff = 4100-6100 K,
log g = 4.0-4.9, and Kepler magnitude Kp < 15 mag (Table 1). These parameters select for
the brightest half of the GK-type target stars (the other half being fainter, Kp > 15 mag),
as shown in Figure 1. The goal is to limit our study to main sequence GK stars well
characterized in the KIC (Brown et al. 2011) and to provide a stellar sample that is a
close match to that of Howard et al. (2010), offering an opportunity for a comparison of
the radii and masses from the two surveys. The brightness limit of Kp < 15 promotes high
photometric signal-to-noise ratios, needed to detect the smaller planets. These three
criteria in T_eff, log g, and Kp seem, at first glance, to be quite modest, representing
the core target stars in the Kepler mission. Yet these three stellar criteria yield a
subsample of only 58,041 target stars, roughly one third of the total Kepler sample. In
this study, we consider only this subset of Kepler stars and the associated planet
candidates detected among them.
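
The three stellar criteria above amount to a simple filter on catalog values. The sketch below is illustrative; the function name and the sample catalog rows are hypothetical, and treating the T_eff and log g bounds as inclusive is an assumption.

```python
def in_solar_subset(teff_k, logg_cgs, kp_mag):
    """True if a star falls in the bright, main sequence GK 'solar subset'.

    Bounds on T_eff and log g are treated as inclusive (an assumption).
    """
    return (4100.0 <= teff_k <= 6100.0
            and 4.0 <= logg_cgs <= 4.9
            and kp_mag < 15.0)


# Hypothetical catalog rows: (T_eff in K, log g in cgs, Kp in mag).
catalog = [
    (5780.0, 4.44, 12.1),   # Sun-like, bright: kept
    (5200.0, 4.55, 15.6),   # K dwarf but too faint: dropped
    (4900.0, 3.20, 13.0),   # evolved star (low log g): dropped
    (6500.0, 4.30, 11.5),   # too hot: dropped
]
kept = [row for row in catalog if in_solar_subset(*row)]
print(f"{len(kept)} of {len(catalog)} stars retained")
```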

2.2. Winnowing Kepler Target Stars by Detectable 
Planet Radius and Period 

http://archive.stsci.edu/kepler/
TABLE 1

Properties of Stellar and Planetary Samples

Parameter                               Value
--------------------------------------  -------------
Stellar effective temperature, T_eff    4100-6100 K
Stellar gravity, log g (cgs)            4.0-4.9
Kepler magnitude, Kp                    < 15
Number of stars, n★                     58,041
Orbital period, P                       < 50 days
Planet radius, R_p                      2-32 R⊕
Detection threshold, SNR (90 days)      > 10
Number of planet candidates, n_pl       438




[Figure 1: horizontal axis is Stellar Effective Temperature, T_eff (K), running from 10000 K to 4000 K.]

Fig. 1. Kepler target stars (small black dots) and Kepler stars with planet candidates
(red dots) plotted as a function of T_eff and log g from the KIC. Only bright stars
(Kp < 15) are shown and considered in this study. The inner blue rectangle marks the
"solar subset" (T_eff = 4100-6100 K and log g = 4.0-4.9) of main sequence G and K stars
considered for most of this study. This domain contains 58,041 stars with 438 planet
candidates. In Section 3 we consider planet occurrence as a function of T_eff. For that
analysis we consider a broader range of T_eff = 3600-7100 K (green outer rectangle). The
error bars in the upper left show the typical uncertainties of 135 K in T_eff and 0.25 dex
in log g.



We further restrict the Kepler stellar sample by including only those stars with high
enough photometric quality to permit detection of planets of a specified radius and orbital
period. To begin, we consider differential domains in the two-dimensional space of planet
radius and orbital period. For each differential domain, only a subset of the Kepler target
stars have sufficient photometric quality to permit detection of such a planet. In effect,
the survey for such specific planets is carried out only among those stars having
photometric quality so high that the transit signals stand out easily (literally by eye).
For photometric quality we adopt the metric of the signal-to-noise ratio (SNR) of the
transit signal integrated over a 90 day photometric time series. We define SNR to be the
transit depth divided by the uncertainty in that depth due to photometric noise (to be
defined quantitatively below).

We set a threshold, SNR > 10, which is higher than that (SNR > 7.0) adopted by Borucki
et al. (2011), lending our study an even higher standard of detection. Thus, we restrict
our sample of stars so strongly that planets of a specified radius and orbital period are
rarely, if ever, missed by the "Transiting Planet Search" (TPS; Jenkins et al. 2010c)
pipeline. Moreover, we base our SNR criterion on just a single 90 day quarter of Kepler
photometry. This conservatively demands that the photometric pipeline detect transits only
during a single pointing of the telescope. (The CCD pixels that a particular star falls on
change quarterly as Kepler is rolled by 90 degrees to maintain solar illumination.) As
noted in Borucki et al. (2011), the photometric pipeline does not yet have the capability
to stitch together multiple quarters of photometry and search for transits. In contrast,
the SNR quoted in Borucki et al. (2011) was based on the totality of photometry, Q0-Q5
(approximately one year in duration). Thus we are setting a threshold that is considerably
more stringent than in Borucki et al. (2011), i.e. we include only target stars of the
quietest photometric behavior. The goal, described in more detail below, is to establish
a subset of Kepler target stars for which the detection efficiency of planets (of specified
radius and orbital period) is close to 100%.

Finally, we restrict our study to orbital periods under 50 days. All criteria by which
Kepler stars are retained in our study are given in Table 1. As demonstrated below, these
restrictions on SNR > 10 and on orbital period (P < 50 days) yield a final subsample of
Kepler targets for which very few planet candidates will be missed by the current Kepler
photometric pipeline, as the transit signals both overwhelm the noise and repeat multiple
times (for P < 50 days).

We explored the adoption of two measures of photometric SNR for each Kepler star, one
taken directly from Borucki et al. (2011) and the other using the so-called Combined
Differential Photometric Precision (σ_CDPP), which is the empirical RMS noise in bins of
a specified time interval, coming from the Kepler pipeline. Actually, Borucki et al. (2011)
derived their SNR values from σ_CDPP, integrated over all transits in Q0-Q5. We employed
the measured σ_CDPP for time intervals of 3 hr and compared the SNR from Borucki et al.
(2011) for transits to those we computed from the basic σ_CDPP. These values agreed well
(understandably, accounting for the use of a total SNR from all five quarters in Borucki
et al. 2011). Thus, we adopted the basic 3 hr σ_CDPP for each target star as the origin
of our noise measure.

Each Kepler target star has its own measured RMS noise level, σ_CDPP. Typical 3 hr σ_CDPP
values are 30-300 parts per million (ppm), as shown in Figure 1 of Jenkins et al. (2010b),
albeit for 6 hr time bins. Clearly, the photometrically noisiest target stars are less
amenable to the detection of small planets, which we treat below. The noise has three
sources. One is simply Poisson errors from the finite number of photons received, dependent
on the star's brightness, causing fainter stars to have higher σ_CDPP. This photon-limited
photometric noise is represented by the lower envelope of the noise as a function of
magnitude in Figure 1 of Jenkins et al. (2010b). A second noise source stems from stellar
surface physics including spots, convective overshoot and turbulence (granulation),
acoustic p-modes, and magnetic effects arising from plage regions and reconnection events.
A third noise source stems from excess image motion in Q0, Q1, and Q2 caused by the use of
variable guide stars that have now been dropped. In Q2 the presence of bulk drift corrected
by four re-pointings of the bore sight, plus a safe mode followed by an unusually large
thermal recovery, also contributed. The measured σ_CDPP accounts for all such sources, as
well as any unmentioned, since it is an empirical measure.

Using σ_CDPP for each target star, we define SNR integrated over all transits as

    SNR = \frac{\delta}{\sigma_{\rm CDPP}} \sqrt{\frac{n_{\rm tr} \, t_{\rm dur}}{3\,{\rm hr}}}.    (1)

Here δ = R_p^2/R★^2 is the photometric depth of a central transit of a planet of radius
R_p transiting a star of radius R★, n_tr is the number of transits observed in a 90 day
quarter, t_dur is the transit duration, and the factor of 3 hr accounts for the duration
over which σ_CDPP was measured. We include only those stars yielding SNR > 10, for a given
specified transit depth and orbital period.
The threshold imposes such a stringent selection of target stars that few planets are
missed by the Kepler Transiting Planet Search (TPS) pipeline. Our planet occurrence
analysis below assumes that (nearly) all planets with R_p > 2 R⊕ that meet the above SNR
criteria have been detected by the Kepler pipeline and are included in Borucki et al.
(2011). The Kepler team is currently engaged in a considerable study of the completeness
of the Kepler pipeline by injecting simulated transit signals into the pipeline at the CCD
pixel level and measuring the recovery rate of those signals as a function of SNR and
other parameters. In advance of the results of this major numerical experiment, we
demonstrate detection completeness of SNR > 10 signals in two ways.
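
The SNR criterion of Equation (1) can be sketched numerically as below. This is an illustration under stated assumptions rather than the code used in this study: the function name and example values are hypothetical, the transit depth is taken as (R_p/R★)^2, the number of transits is approximated by the expected value 90 days / P, the transit duration is supplied by the caller, and the Sun-to-Earth radius ratio of about 109 is an approximate constant.

```python
import math

REARTH_PER_RSUN = 109.2  # approximate ratio of solar to Earth radius (assumed)


def transit_snr(rp_rearth, rstar_rsun, period_days, t_dur_hr,
                sigma_cdpp_ppm, baseline_days=90.0):
    """SNR of Equation (1) for a hypothetical planet around a given star."""
    # Depth of a central transit, (R_p / R_star)^2, as a fraction of the flux.
    depth = (rp_rearth / (rstar_rsun * REARTH_PER_RSUN)) ** 2
    # Expected number of transits in one quarter (assumption: baseline / P).
    n_tr = baseline_days / period_days
    sigma = sigma_cdpp_ppm * 1.0e-6  # 3 hr noise converted from ppm
    return depth / sigma * math.sqrt(n_tr * t_dur_hr / 3.0)


# Example: a 2 Earth-radius planet, 40 day period, 5 hr transit,
# Sun-sized star with 50 ppm noise on 3 hr timescales.
snr = transit_snr(rp_rearth=2.0, rstar_rsun=1.0, period_days=40.0,
                  t_dur_hr=5.0, sigma_cdpp_ppm=50.0)
print(f"SNR = {snr:.1f}; star retained for this domain: {snr > 10.0}")
```

In this picture a star is counted in the denominator for a given radius-period domain only if the SNR computed for a planet in that domain exceeds 10.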

First, Figure 2 shows the SNR of detected transits as a function of Kepler Object of
Interest (KOI) number. The Kepler photometry and TPS pipeline detect planet candidates
over the course of months as data arrive. There is a learning curve involved with this
process, as the software matures and human intervention is tuned (Rowe et al. 2010). As
a result, the obvious (high SNR) planet candidates are issued low KOI numbers as they are
detected early in the mission. The shallower transits, relative to noise, are identified
later as they require more data, and are issued larger KOI numbers. Thus KOI number is a
rough proxy for the time required to accumulate enough photometry to identify the planet
candidate. Among the KOIs 1050-1600, much less vetting was done, and indeed we rejected
five planet candidates (KOIs 1187, 1227, 1387, 1391, and 1465) reported in Borucki et al.
(2011) based on both V-shaped light curves and at least one other property indicating a
likely eclipsing binary.

Figure 2 shows that the early KOIs, 1-1000, had a wide range of SNR values spanning 7-1000,
as the first transit signals had a variety of depths. KOIs 400-1000 correspond to pipeline
detections of transit planet candidates around target stars as faint as 15th mag and
fainter. The more recent transit identifications of KOIs 1000-1600 exhibit far fewer
transits with SNR > 20 and about half of these new KOIs have SNR < 10, below our threshold
for inclusion. Apparently most newly identified KOIs have SNR < 20, and few planets remain
to be found with P < 50 days and SNR > 20. Figure 2 suggests that the great majority of
planet candidates with P < 50 days and SNR > 10 have already been identified


TABLE 2

Properties of Planet Candidates in Figure 3

KOI      Kp      R★      R_p     P        SNR        SNR
         (mag)   (R☉)    (R⊕)    (days)   (Q0-Q5)    (90 days)
-------  ------  ------  ------  -------  ---------  ---------
223.02   14.7    0.74    2.40    41.0     25         12.3
542.01   14.4    1.13    2.70    41.9     21         11.2
592.01   14.3    1.08    2.70    39.8     19          9.7
711.01   14.0    1.00    2.74    44.7     34         25.3

by the Kepler pipeline. This apparent asymptotic success 
in the detection of SNR > 10 transits is enabled by our 
orbital period limit of 50 days which is considerably less 
than the duration of a quarter (90 days). The current 
Kepler pipeline for identifying transits within a single 90 
day quarter is more robust than the multi-quarter transit 
search. For such short periods, at least two transits typ- 
ically occur within one quarter. Moreover, when such 
planet candidates appear during another quarter, the 
short period planets are quickly confirmed. We suspect 
that for periods greater than 90 days, many more planet 
candidates are yet to be identified by Kepler. Thus, this 
study restricts itself to P < 50 days in part because 
of the demonstrated completeness of detection for such 
short periods. 

We examined the light curves themselves for a second demonstration of nearly complete
detection efficiency of planet candidates with P < 50 days, R_p > 2 R⊕, and SNR > 10.
Figure 3 shows four representative light curves of planet candidates whose properties are
listed in Table 2. All four have small radii of 2-3 R⊕ and "long" periods of 30-50 days,
the most difficult domain for planet detection in this study (the lower right corner of
the radius-period plane in the figure discussed below). The SNR values are near the
threshold value of ~10; in fact, one planet candidate (KOI 592.01) has a SNR of 9.7 and
is therefore conservatively excluded from this study. The four light curves show how
clearly such transits stand out, indicating the high detection completeness of planets
down to 2 R⊕ and P < 50 days for the SNR > 10 threshold we adopted.

2.3. Identifying Kepler Planet Candidates 

We adopt the Kepler planet candidates and their orbital periods and planet radii from
Table 2 of Borucki et al. (2011), with two exceptions. First, we exclude the five KOIs
noted above that are likely to be false positives. Second, we exclude KOIs that orbit
"unclassified" KIC stars (identified with "T_eff Flag" = 1 in Table 1 of Borucki et al.
2011). We measure planet occurrence only around stars with well defined stellar parameters
from the KIC.

To summarize the Borucki et al. (2011) results, photometry at roughly 100 ppm levels in
29.4 minute integrations allows detection of repeated, brief drops in stellar brightness
caused by planet transits across the star. The technical specifics of the instrument,
photometry, and transit detection are described in Borucki et al. (2010a); Koch et al.
(2010); Jenkins et al. (2010b,c); Caldwell et al. (2010). We begin the identification of
planet candidates based on those revealed in public Kepler photometric data (Q0-Q2). This
data release contains 997 stars with a total of 1,235 planetary candidates that show
transit-like signatures, all with some follow-up work that could not rule out the planet
hypothesis (Gautier et al. 2010). Borucki et al. (2011) includes three planets discovered
in the Kepler field before launch: TrES-2b (O'Donovan et al. 2006), HAT-P-7b (Pál et al.
2008), and HAT-P-11b (Bakos et al. 2010). We include only those planet candidates that
meet two SNR standards: they must have SNR > 10 in one quarter alone and they must have
SNR > 7 in all quarters. The former standard should guarantee the latter, but this double
standard reinforces the quality of the planet candidates.

Fig. 2. Signal-to-noise ratio (SNR) of detected transits during a 90 day quarter. See text for definition of SNR. Only planets orbiting
stars with T_eff = 4100-6100 K and log g = 4.0-4.9 are shown. Planets orbiting stars with Kepler magnitudes Kp < 13, 13-14, 14-15, and
> 15 are shown in red, yellow, green, and blue, respectively. Planets with orbital periods P < 8, 8-16, 16-32, and > 32 days are shown as
filled circles, squares, five-point stars, and triangles, respectively. Our analysis considers only transits with SNR > 10 (upper dashed line).

As this data release contains 136 days of photometric data, with only a few small windows
of down time, most planet candidates with periods under 50 days have exhibited two or more
transits. The multiple transits for P < 50 days offer relatively secure candidates,
periods, and radii, provided by the repeated transit light curves. For P < 40 days, Kepler
has typically detected three or more transits in the publicly available data. Moreover, in
Borucki et al. (2011) the periods, radii, and ephemerides are based on the full set of
Kepler data obtained in Q0-Q5, constituting over one year of photometric data. Thus,
planet candidates with periods under 50 days are securely detected with multiple transits.
They have improved SNR in the light curves from the full set of data available to the
Kepler team, offering excellent verification, radii, and periods for short period planets.
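
The transit counts quoted above follow from the length of the data set: an uninterrupted window of length T always contains at least floor(T/P) transits of a planet with period P. A minimal sketch, assuming continuous coverage (the small windows of down time are ignored) and using a hypothetical function name:

```python
import math


def min_transits(baseline_days, period_days):
    """Minimum number of transits falling in a continuous observing window."""
    return math.floor(baseline_days / period_days)


for period in (50.0, 40.0, 10.0):
    print(f"P = {period:4.0f} d -> at least {min_transits(136.0, period)} transits in 136 d")
```

Depending on the transit phase, one additional transit may fall within the window.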

2.4. False Positives 

We expect that some of the planet candidates reported in Borucki et al. (2011) are
actually false positives. These would be mostly background eclipsing binaries diluted by
the foreground star. They may also be background stars orbited by a transiting planet of
larger radius, but diluted by the light of the foreground star, mimicking a smaller planet.
False positives can also occur from gravitationally bound companion stars that are
eclipsing binaries or have larger transiting planets. We expect that false positive
probabilities will be estimated for most planet candidates in Borucki et al. (2011) using
"BLENDER" (Torres et al. 2011).

In the meantime, the false positive rate has been estimated carefully by Morton & Johnson (2011). They find the false positive probability for candidates that pass the standard vetting gates to be less than 10% and normally closer to 5%. In particular, the Kepler vetting process included a difference analysis between CCD images taken in and out of transit, allowing direct detection of the pixel that contains the eclipsing binary, if any. This vetting process found that ~12% of the original planet candidates were indeed eclipsing binaries in neighboring pixels, and these were deemed false positives and removed from Table 2 of Borucki et al. (2011). This process leaves only the one pixel itself, with a half-width of 2 arcsec, within which any eclipsing binary must reside. As 12% of the planet candidates had an eclipsing binary within the ~10 pixels total of the photometric aperture, the rate of eclipsing binaries hidden behind the remaining one pixel is likely to be ~1.2%, a small probability of false positives. The bound, hierarchical eclipsing binaries were estimated by Morton & Johnson (2011), who find that another few percent may be such false positives, yielding a total false positive probability of ~5-10%. Morton & Johnson (2011) note that the false positive probability depends on transit depth δ, galactic latitude b, and Kp. Using their "detailed framework" and computing the false positive probability for each of the 438 planet candidates among our "solar subset" (Table 1), we estimate that 22 planet candidates are actually false positives. The resulting false positive rate of 5% is on the low end of the 5-10% estimate above because we restricted our stellar sample to bright main sequence stars and our planet sample to Rp > 2 R⊕. We do not expect this low false positive rate to substantially impact our statistical results below.

We note that while the precise details of these estimates depend on a priori assumptions of the overall planet occurrence rate (which we conservatively take to be 20%) and of the planet radius distribution (which follows Figure 5 of Morton & Johnson (2011)), the overall low false positive probability is controlled by the relative scarcity of blend scenarios compared to planets. We also note that these estimates do not account for uncertainties in R*, which may result in some jovian-sized candidates actually being M dwarfs eclipsing subgiant stars.

Nearly all of the KOIs reported in Borucki et al. (2011) are formally "planet candidates", absent [...] determination (Borucki et al. 2010b; Koch et al. 2010a; [...] 2010; Jenkins et al. [...]). For simplicity we will refer to all KOIs as "planets", bearing in mind that a small percentage will turn out to be false positives.

Fig. 3. — Four representative light curves of planet candidates with Rp = 2-3 R⊕ and P = 30-50 days, the domain of most challenging detection in this study. See Table 2 for planetary and stellar properties. In each panel, the transit light curve (lower, red trace) and photometric measurements 180 degrees out of phase (upper, green trace) are shown with the best-fit model overlaid. Plus and star symbols show alternating transits. Only photometry from Q2 is displayed. These light curves demonstrate the data quality for the lowest SNR planet candidates included in this study, most with a transit depth only ~10 times greater than the uncertainty in the mean depth due to noise. Still, the transits are clearly visible to the eye. The Kepler pipeline is unlikely to miss many of these planet candidates, despite their being in the least detectable domain of the study. This indicates the security of these detections and the high completeness of such planet candidates, in support of Figure 2.

3. PLANET OCCURRENCE 

We define planet occurrence, f, as the fraction of a defined population of stars (in Teff, log g, Kp) having planets within a domain of planet radius and period, including all orbital inclinations. We computed planet occurrence as a function of planet radius and orbital period in the grid of cells in Figure 4. Within each cell we counted the number of planets detected by Kepler for the subset of stars surveyed with sufficient precision to compute the local planet occurrence, f_cell. Our treatment corrects for planets not detected by Kepler because of non-transiting orbital inclinations and because of insufficient photometric precision.

The average planet occurrence within a confined cell of Rp and P is

$$ f_{\rm cell} = \sum_{j=1}^{n_{\rm pl,cell}} \frac{1/p_j}{n_{\star,j}}, \qquad (2) $$

where the sum is over all detected planets within the cell that have SNR > 10. In the numerator, p_j = (R*/a)_j
Fig. 4. — Planet occurrence as a function of planet radius and orbital period for P < 50 days. Planet occurrence spans more than three orders of magnitude and increases substantially for longer orbital periods and smaller planet radii. Planets detected by Kepler having SNR > 10 are shown as black dots. The phase space is divided into a grid of logarithmically spaced cells within which planet occurrence is computed. Only stars in the "solar subset" (see selection criteria in Table 1) were used to compute occurrence. Cell color indicates planet occurrence with the color scale on the top in two sets of units, occurrence per cell and occurrence per logarithmic area unit. White cells contain no detected planets. Planet occurrence measurements are incomplete and likely contain systematic errors in the hatched region (Rp < 2 R⊕). Annotations in white text within each cell list occurrence statistics: upper left — the number of detected planets with SNR > 10, n_pl,cell, and in parentheses the number of augmented planets correcting for non-transiting geometries, n_pl,aug,cell; lower left — the number of stars surveyed by Kepler around which a hypothetical transiting planet with Rp and P values from the middle of the cell could be detected with SNR > 10; lower right — f_cell, planet occurrence, corrected for geometry and detection incompleteness; upper right — d²f/dlog10 P/dlog10 Rp, planet occurrence per logarithmic area unit (d log10 P d log10 Rp = 28.5 grid cells).
is the a priori probability of a transiting orientation of planet j. Each individual planet is augmented in its contribution to the planet count by a factor of a/R* to account for the number of planets with similar radii and periods that are not detected because of non-transiting geometries. For each planet, its specific value of (a/R*)_j is used, not the average a/R* of the cell in which it resides. Each scaled semi-major axis (a/R*)_j is measured directly from Kepler photometry and is not the ratio of two quantities, a_j and R*,j, separately measured with lower precision. In the denominator, n_*,j is the number of stars whose physical properties and photometric stability are sufficient so that a planet of radius Rp,j and period P_j would have been detected with SNR > 10 as defined by equation (1). Note that our requirement for SNR > 10 is applied to both the numerator (the planets that count toward the occurrence rate) and the denominator (the stars around which those planets could have been detected) of equation (2).
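The occurrence sum of equation (2) can be sketched in a few lines. This is a minimal illustration, not the pipeline used for the paper: the function name `cell_occurrence` and the three-planet cell are hypothetical, and the input numbers are invented purely to show the arithmetic.

```python
def cell_occurrence(planets):
    """Sum (1/p_j) / n_star_j over the detected planets in one cell.

    `planets` is a list of (a_over_rstar, n_star) pairs, where
    a_over_rstar = (a/R*)_j = 1/p_j is the geometric augmentation factor and
    n_star = n_{*,j} is the number of stars around which planet j would have
    been detected with SNR > 10.
    """
    return sum(a_over_rstar / n_star for a_over_rstar, n_star in planets)


# Hypothetical cell with three detected planets (illustrative values only).
example_cell = [(20.0, 50000), (25.0, 48000), (30.0, 45000)]
f_cell = cell_occurrence(example_cell)
print(f"f_cell = {f_cell:.7f}")
```

Each planet carries its own augmentation factor and its own star count, which is why the sum cannot in general be replaced by a single ratio of cell-averaged quantities.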

While Figure 4 does not show error estimates for f_cell, we compute them with binomial statistics and use them in the analysis that follows. We calculate the binomial probability distribution of drawing n_pl,cell planets from n_*,eff,cell = n_pl,cell/f_cell "effective" stars. The ±1σ errors in f_cell are computed from the 15.9 and 84.1 percentile levels in the cumulative binomial distribution. Note that n_pl,cell is typically a small number (in Figure 4, n_pl,cell has a range of 1-36 detected planets), so the errors within individual cells can be significant. These errors and the corresponding occurrence fluctuations between adjacent cells average out when cells are binned together to compute occurrence as a function of radius or period. Also note that our error estimates account only for random errors and not for systematic effects.
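One way to obtain such percentile bounds is sketched below. It is an interpretation rather than the authors' code: it assumes a flat prior on the occurrence f, treats n_*,eff,cell = n_pl,cell/f_cell as the number of binomial trials, and integrates the resulting distribution numerically on a grid. The function name `occurrence_interval` and the example inputs are hypothetical.

```python
import math


def occurrence_interval(n_pl, f_cell, grid=20001):
    """Return the 15.9 and 84.1 percentile bounds on the occurrence f.

    Assumptions of this sketch: flat prior on f, and n_eff = n_pl / f_cell
    "effective" stars acting as the number of binomial trials.
    """
    n_eff = n_pl / f_cell
    fs = [(i + 0.5) / grid for i in range(grid)]  # grid midpoints in (0, 1)
    log_w = [n_pl * math.log(f) + (n_eff - n_pl) * math.log(1.0 - f) for f in fs]
    peak = max(log_w)
    weights = [math.exp(lw - peak) for lw in log_w]  # rescaled to avoid underflow
    total = sum(weights)

    bounds, cumulative, targets = [], 0.0, [0.159, 0.841]
    for f, w in zip(fs, weights):
        cumulative += w / total
        while targets and cumulative >= targets[0]:
            bounds.append(f)
            targets.pop(0)
    return bounds[0], bounds[1]


# Hypothetical cell: 9 detected planets and a measured occurrence of 0.01.
lo, hi = occurrence_interval(n_pl=9, f_cell=0.01)
print(f"f_cell = 0.0100 (-{0.01 - lo:.4f}, +{hi - 0.01:.4f})")
```

With only a handful of planets per cell the interval is visibly asymmetric, which is the behavior the text warns about for individual cells.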

Figure 4 contains numerical annotations to help digest the wealth of planet occurrence information. In the lower left of each cell is n_*,mid-cell, the number of Kepler targets with sufficient σCDPP such that a central transit of a planet with Rp and P values from the middle of the cell could have been detected with SNR > 10. Above this, we list n_pl,cell followed by n_pl,aug,cell in parentheses. n_pl,aug,cell is the total extrapolated number of planets in each cell after correcting for the a priori transit probability for each planet,

$$ n_{\rm pl,aug,cell} = \sum_{j=1}^{n_{\rm pl,cell}} \frac{1}{p_j}. \qquad (3) $$

The annotation in the lower right of each cell is f_cell. The reader can quickly check that planet occurrence is computed correctly by verifying that f_cell ≈ n_pl,aug,cell/n_*,mid-cell; planet occurrence is the ratio of the number of planets to the number of stars searched. Finally, the annotation in the top right of each cell is

This approximate expression for f_cell breaks down in cells where the number of stars with SNR > 10 (n_*) varies rapidly across the cell. Equation (2) computes planet occurrence locally using n_*,j for the specific radius and period of each detected planet, while the n_*,mid-cell listed in the annotations applies to Rp and P in the middle of the cell. Thus, planet occurrence is more poorly determined in regions of Figure 4 where the detection completeness varies rapidly and/or the detected planets are clustered in one section of the cell. These poorly measured regions typically have Rp < 2 R⊕ and longer orbital periods.



TABLE 3

Planet Multiplicity vs. Planet Size

               Fraction of planet hosts with a second planet ...
Rp (R⊕)        in same Rp range    within 0.5 Rp-2 Rp    with any Rp

1.0-1.4        0.05                0.16                  0.26
1.4-2.0        0.09                0.25                  0.27
2.0-2.8        0.08                0.23                  0.25
2.8-4.0        0.12                0.28                  0.30
4.0-5.6        0.04                0.09                  0.13
5.6-8.0        0.04                0.09                  0.13
8.0-11.3       0.00                0.06                  0.06
11.3-16.0      0.00                0.00                  0.06
16.0-22.6      0.00                0.00                  0.00



f_cell in units of occurrence per d log10 P d log10 Rp (occurrence per factor of ten in Rp and P), a unit that is independent of the choice of cell size. There are 28.5 grid cells per unit of d log10 P d log10 Rp; that is, a region whose edges span factors of ten in Rp and P has 28.5 grid cells of the size shown in Figure 4. Each cell spans a factor of √2 in Rp and a factor of 5^(1/3) in P.
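The figure of 28.5 cells per logarithmic area unit follows directly from the cell dimensions. The short check below assumes the cell edges are exactly √2 in radius and 5^(1/3) in period; the helper `per_log_area` is a hypothetical name for the unit conversion.

```python
import math

dlog_r = math.log10(math.sqrt(2.0))  # cell height in log10(Rp), about 0.151 dex
dlog_p = math.log10(5.0) / 3.0       # cell width in log10(P), about 0.233 dex
cells_per_unit_area = 1.0 / (dlog_r * dlog_p)
print(round(cells_per_unit_area, 1))  # 28.5


def per_log_area(f_cell):
    """Convert occurrence per cell to occurrence per d(log10 P) d(log10 Rp)."""
    return f_cell * cells_per_unit_area
```

Multiplying a per-cell occurrence by this constant gives the value in the upper right annotation of the corresponding cell.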

The distribution of planet occurrence in Figure 4 offers remarkable clues about the processes of planet formation, migration, and evolution. Planet occurrence increases substantially with decreasing planet radius and increasing orbital period. Planets larger than 1.5 times the size of Jupiter (Rp > 16 R⊕) are extremely rare. Planets with P < 2 days are similarly rare. Because of incompleteness, we tread with caution for planets with Rp = 1-2 R⊕, but note that these planets have an occurrence similar to planets with Rp = 2-4 R⊕. Their actual occurrence could be higher due to incompleteness of the pipeline at identifying the smallest planets or lower due to a higher rate of false positives.

Planet multiplicity complicates our measurements of planet occurrence. We interpret f_cell as the fraction of stars having a planet in the narrow range of P and Rp that define a particular cell. With few exceptions, stars are not orbited by planets with nearly the same radii and periods. However, when we apply equation (2) to larger domains of the radius-period plane, for example by marginalizing over P (Section 3.1) or over Rp (Section 3.2), the same star can be counted multiple times in equation (2) if multiple planets fall within that larger domain of Rp and P. Thus, our occurrence measurements are actually of the mean number of planets per star meeting some criteria, rather than the fraction of stars having at least one planet that meets those criteria. When the rate of planet multiplicity within a domain is low, these two quantities are nearly equal.

The 438 planets in our solar subset of stars (Table 1) orbit a total of 375 stars. The fraction of planets in multi-transiting systems is 0.27 and the fraction of host stars with multiple transiting planets is 0.15. In Table 3 we list three measures of planet multiplicity for the planetary systems within the solar subset (Table 1). For each of the Rp ranges in Figure 4 we list the fraction of host stars with more than one planet in the specified Rp range, the fraction of hosts with one planet in the Rp range and a second planet with a radius within a factor of two of the first planet's, and the fraction with one planet in the Rp range and a second planet having any Rp.
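The two multiplicity fractions quoted above are mutually consistent, as the following arithmetic sketch shows. The count of multi-planet hosts (about 56) is inferred here from the rounded fraction 0.15 and is not a number stated in the text.

```python
n_planets, n_hosts = 438, 375
frac_multi_hosts = 0.15  # quoted fraction of hosts with multiple transiting planets

n_multi_hosts = round(frac_multi_hosts * n_hosts)  # inferred, about 56 hosts
n_single_hosts = n_hosts - n_multi_hosts           # hosts with exactly one planet
n_planets_in_multis = n_planets - n_single_hosts   # planets not in single systems

frac_planets_in_multis = n_planets_in_multis / n_planets
mean_planets_per_host = n_planets / n_hosts

print(n_multi_hosts, n_planets_in_multis, round(frac_planets_in_multis, 2))
print(round(mean_planets_per_host, 2))
```

The inferred fraction of planets in multi-transiting systems rounds to the quoted 0.27, and the mean of about 1.17 detected planets per host illustrates why planets per star and stars with planets differ only modestly in this sample.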

Table 3 suggests that multiplicity is common. Lissauer et al. (2011b) noted that the multi-planet systems observed by Kepler have relatively low mutual inclinations (typically a few degrees), suggesting a significant correlation of inclinations. Converting our measurements of the mean number of planets per star to the fraction of stars having at least one planet requires an understanding of the underlying multiplicity and inclination distributions. Such an analysis is attempted by Lissauer et al. (2011b), but is beyond the scope of this paper.

It is worth identifying additional sources of error and simplifying assumptions in our methods. The largest source of error stems directly from the 35% rms uncertainty in R* from the KIC, which propagates directly to a 35% uncertainty in Rp. We assumed a central transit over the full stellar diameter in equation (2). For randomly distributed transiting orientations, the average duration is reduced to π/4 times the duration of a central transit. Thus, this correction reduces our SNR in equation (1) by a factor of √(π/4), i.e. a true signal-to-noise ratio threshold of 8.8 instead of 10.0. This is still a very conservative detection threshold. Additionally, our method does not account for the small fraction of transits that are grazing and have reduced significance. We assumed perfect √t scaling for σCDPP values computed for 3 hr intervals. This may underestimate σCDPP for a 6 hr interval (approximately the duration of a P = 50 day transit) by ~10%. These are minor corrections and affect the numerator and denominator of equation (2) nearly equally.
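The π/4 duration factor and the resulting threshold can be checked numerically. The sketch assumes impact parameters b distributed uniformly on [0, 1], a transit duration proportional to the chord length √(1 - b²), and an SNR that scales as the square root of the duration; these simplifications are assumptions of the example.

```python
import math

n = 100000
# Mean chord length (in units of the stellar diameter) for uniform b in [0, 1],
# evaluated with a midpoint rule.
mean_chord = sum(math.sqrt(1.0 - ((i + 0.5) / n) ** 2) for i in range(n)) / n
snr_factor = math.sqrt(mean_chord)  # SNR assumed proportional to sqrt(duration)

print(round(mean_chord, 4), round(math.pi / 4, 4))
print(round(10.0 * snr_factor, 2))
```

Under these assumptions the nominal threshold of 10 corresponds to an effective threshold just under 8.9, in line with the value of 8.8 quoted in the text.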

3.1. Occurrence as a Function of Planet Radius 

Planet occurrence varies by three orders of magnitude in the radius-period plane (Figure 4). To isolate the dependence on these parameters, we first considered planet occurrence as a function of planet radius, marginalizing over all planets with P < 50 days. We computed occurrence using equation (2) for cells with the ranges of radii in Figure 4 but for all periods less than 50 days. This is equivalent to summing the occurrence values in Figure 4 along rows of cells to obtain the occurrence for all planets in a radius interval with P < 50 days. The resulting distribution of planet radii (Figure 5) increases substantially with decreasing Rp.

Fig. 5. — Planet occurrence as a function of planet radius for planets with P < 50 days (black filled circles and histogram). The top and bottom panels show the same planet occurrence measurements on logarithmic and linear scales. Only GK stars consistent with the selection criteria in Table 1 were used to compute occurrence. These measurements are the sum of occurrence values along rows in Figure 4. Estimates of planet occurrence are incomplete in the hatched region (Rp < 2 R⊕). Error bars indicate statistical uncertainties and do not include systematic effects, which are particularly important for Rp < 2 R⊕. No planets with radii of 22.6-32 R⊕ were detected (see top row of cells in Figure 4). A power law fit to occurrence measurements for Rp = 2-22.6 R⊕ (red filled circles and dashed line) demonstrates that close-in planet occurrence increases substantially with decreasing planet radius.

We modeled this distribution of planet occurrence with planet radius as a power law of the form

$$ \frac{df(R)}{d\log R} = k_R R^{\alpha}. \qquad (4) $$

Here df(R)/dlog R is the mean number of planets having P < 50 days per star in a log10 radius interval centered on R (in R⊕), k_R is a normalization constant, and α is the power law exponent. To estimate these parameters, we used measurements from the 2-22.7 R⊕ bins because of incompleteness at smaller radii and a lack of planets at larger radii. We fit equation (4) using a maximum likelihood method (Johnson et al. 2010). Each radius interval contains an estimate of the planet fraction, F_i = df(R_i)/dlog R, based on a number of planet detections made from among an effective number of target stars, such that the probability of F_i is given by the binomial distribution

$$ p(F_i \mid n_{\rm pl}, n_{\rm nd}) = F_i^{\,n_{\rm pl}} (1 - F_i)^{n_{\rm nd}}, \qquad (5) $$

where n_pl is the number of planets detected in a specified radius interval (marginalized over period), n_nd = n_pl/f_cell - n_pl is the effective number of non-detections per radius interval, and f_cell is the estimate of planet occurrence over the marginalized radius interval obtained from equation (2). The planet fraction varies as a function of the mean planet radius Rp,i in each bin, and the best-fitting parameters can be obtained by maximizing the probability of all bins using the model in equation (4):


$$ \mathcal{L} = \prod_{i=1}^{n_{\rm bin}} p(F(R_{p,i})). \qquad (6) $$

In practice the likelihood becomes vanishingly small away from the best-fitting parameters, so we evaluate the logarithm of the likelihood

$$ \ln \mathcal{L} = \sum_{i=1}^{n_{\rm bin}} \ln p(F(R_{p,i})). \qquad (7) $$

We calculate ln L over a uniform grid in k_R and α. The resulting posterior probability distribution is strongly covariant in α and k_R. Marginalizing over each parameter, we find α = -1.92 ± 0.11 and k_R = 2.9 (+0.5, -0.4), where the best-fit values are the median of the marginalized 1-dimensional parameter distributions and the error bars are the 15.9 and 84.1 percentile levels.
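The grid evaluation of the log-likelihood can be illustrated with synthetic data. Everything in the sketch below other than the functional forms is an assumption: the bin centers, the effective star count of 1000 per bin, and the grid ranges are invented, and the example simply takes the grid maximum instead of marginalizing to medians and percentiles as the text describes.

```python
import math


def log_likelihood(k_r, alpha, bins):
    """Sum of ln p over radius bins, with p = F**n_pl * (1 - F)**n_nd.

    `bins` holds (r_mid, n_pl, n_nd) per radius interval, and the model planet
    fraction is F = k_r * r_mid**alpha evaluated at the bin center.
    """
    total = 0.0
    for r_mid, n_pl, n_nd in bins:
        frac = k_r * r_mid ** alpha
        if not 0.0 < frac < 1.0:
            return -math.inf  # model outside the valid range of a probability
        total += n_pl * math.log(frac) + n_nd * math.log(1.0 - frac)
    return total


def grid_fit(bins, k_grid, alpha_grid):
    """Return the (k_r, alpha) grid point with the largest log-likelihood."""
    best = max((log_likelihood(k, a, bins), k, a) for k in k_grid for a in alpha_grid)
    return best[1], best[2]


# Synthetic, noise-free bins generated from known parameters (illustration only).
TRUE_K, TRUE_ALPHA, N_EFF = 2.9, -1.92, 1000.0
bins = []
for r_mid in [2.4, 3.4, 4.8, 6.7, 9.5, 13.5, 19.0]:
    f_true = TRUE_K * r_mid ** TRUE_ALPHA
    bins.append((r_mid, f_true * N_EFF, (1.0 - f_true) * N_EFF))

k_grid = [i / 10 for i in range(20, 41)]             # 2.0 to 4.0 in steps of 0.1
alpha_grid = [-j / 100 for j in range(150, 251, 2)]  # -1.50 to -2.50 in steps of 0.02
k_best, alpha_best = grid_fit(bins, k_grid, alpha_grid)
print(k_best, alpha_best)
```

Because the synthetic counts are noise-free, the grid maximum falls on the input parameters; with real counts the likelihood surface would be broader and elongated along the covariance between the two parameters.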

Howard et al. (2010) found a power law planet mass function, df/dlog M = k' M^α', with k' = 0.39 (+0.27, -0.16) and α' = -0.48 (+0.12, -0.14) for periods P < 50 days and masses Mp sin i = 3-1000 M⊕. We explore planet densities and the mapping of Rp to Mp sin i in Section 5.

3.2. Occurrence as a Function of Orbital Period 

We computed planet occurrence as a function of orbital period using equation (2). We considered this period dependence for ranges of planet radii (Rp = 2-4, 4-8, and 8-32 R⊕). This is equivalent to summing the occurrence values in Figure 4 along two adjacent columns of cells to obtain the occurrence for all planets in specified radius ranges. Figure 6 shows that planet occurrence increases substantially with increasing orbital period, particularly for the smallest planets with Rp = 2-4 R⊕.

For P < 2 days, planets of all radii in our study (> 2 R⊕) are extremely rare, with an occurrence of < 0.001 planets per star. Extending to slightly longer orbital periods, hot Jupiters (P < 10 days, Rp = 8-32 R⊕) are also rare in the Kepler survey. We measure an occurrence of only 0.004 ± 0.001 planets per star, as listed in Table 4. That occurrence value is based on Kp < 15 and the other restrictions that define the "solar subset" (Table 1). Expanding our stellar sample out to Kp < 16, but keeping the other selection criteria constant, we find a hot Jupiter occurrence of 0.005 ± 0.001 planets per star. This fraction is more robust as it is less sensitive to Poisson errors, and our concern about detection incompleteness for Kp > 15 vanishes for hot Jupiters, which typically produce SNR > 1000 signals. Marcy et al. (2005a) found an occurrence of 0.012 ± 0.001 for hot Jupiters (a < 0.1 AU, P < 12 d) around FGK dwarfs in the Solar neighborhood (within 50 pc). Thus, the occurrence of hot Jupiters in the Kepler field is only 40% that in the Solar neighborhood. One might worry that our definition of Rp > 8 R⊕ excludes some hot Jupiters detected by RV surveys. For Kp < 16 and the same Teff and log g criteria, we find an occurrence of 0.0076 ± 0.0013, which is still 40% lower than the RV measurement.

However, we do see modest evidence among the Kepler giant planets of the pile-up of hot Jupiters at orbital periods near 3 days (Figures 4 and 6), as is dramatically obvious from Doppler surveys of stars in the Solar neighborhood (Marcy et al. 2008; Wright et al. 2009). These massive, close-in planets are detected with high completeness by both the Doppler and Kepler techniques (including the geometrical factor for Kepler), so the different occurrence values are real. We are unable to explain this difference, although a paucity of metal-rich stars in the Kepler sample is one possible explanation. Unfortunately, the metallicities of Kepler stars from KIC photometry are inadequate to test this hypothesis (Brown et al. 2011). A future spectroscopic study of Kepler stars with LTE analysis similar to Valenti & Fischer (2005) offers a possible test. In addition to the metallicity difference, the stellar populations may have different Teff distributions, despite having similar Teff ranges. Johnson et al. (2010) found that giant planet occurrence correlates with both stellar metallicity and stellar mass (for which Teff is a proxy). A full study of the occurrence of hot Jupiters is beyond the scope of this paper, but we note that other photometric surveys for transiting hot Jupiters orbiting stars outside of the Solar neighborhood have measured reduced planet occurrence (Gilliland et al. 2000; Weldrake et al. 2008; Gould et al. ...).

Fig. 6. — Planet occurrence (top panel) and cumulative planet occurrence (bottom panel) as a function of orbital period. The occurrence of planets with radii of 2-32 R⊕ (black), 2-4 R⊕ (orange), 4-8 R⊕ (green), and 8-32 R⊕ (blue) are each depicted. Only stars consistent with the selection criteria in Table 1 were used to compute occurrence. Occurrence for planets with Rp < 2 R⊕ is not shown due to incompleteness. The lower panel (cumulative planet occurrence) is the sum of occurrence values in the top panel out to the specified period.
The occurrence of smaller planets with radii Rp = 2-4 R⊕ rises substantially with increasing P out to ~10 days and then rises slowly or plateaus when viewed in a log-log plot (orange histogram, top panel of Figure 6). Out to 50 days we estimate an occurrence of 0.130 ± 0.008 planets per star. Small planets in this radius range account for approximately three quarters of the planets in our study, corrected for incompleteness.

TABLE 4

Planet Occurrence for GK Dwarfs

Rp (R⊕)      P < 10 days        P < 50 days

2-4          0.025 ± 0.003      0.130 ± 0.008
4-8          0.005 ± 0.001      0.023 ± 0.003
8-32         0.004 ± 0.001      0.013 ± 0.002
2-32         0.034 ± 0.003      0.165 ± 0.008

TABLE 5

Best-fit Parameters of Cutoff Power Law Model

Rp (R⊕)      k_P                  β              P0 (days)      γ

2-4          0.064 ± 0.040        0.27 ± 0.27    7.0 ± 1.9      2.6 ± 0.3
4-8          0.0020 ± 0.0012      0.79 ± 0.50    2.2 ± 1.0      4.0 ± 1.2
8-32         0.0025 ± 0.0015      0.37 ± 0.35    1.7 ± 0.7      4.1 ± 2.5
2-32         0.035 ± 0.023        0.52 ± 0.25    4.8 ± 1.6      2.4 ± 0.3
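As a rough cross-check, the radius power law of equation (4) can be integrated over the same radius ranges used for the P < 50 day occurrence values. This is an illustrative calculation using the quoted best-fit values α = -1.92 and k_R = 2.9; it ignores their uncertainties and covariance, and the function name `integrated_occurrence` is hypothetical.

```python
import math

K_R, ALPHA = 2.9, -1.92  # best-fit values quoted for the radius power law


def integrated_occurrence(r_lo, r_hi, k_r=K_R, alpha=ALPHA):
    """Integrate k_r * R**alpha over d(log10 R) between r_lo and r_hi."""
    return k_r * (r_hi ** alpha - r_lo ** alpha) / (alpha * math.log(10.0))


f_small = integrated_occurrence(2.0, 4.0)
f_mid = integrated_occurrence(4.0, 8.0)
f_large = integrated_occurrence(8.0, 32.0)
for label, value in [("2-4", f_small), ("4-8", f_mid), ("8-32", f_large)]:
    print(f"Rp = {label} R_Earth: {value:.4f} planets per star")
print(f"fraction in 2-4 R_Earth: {f_small / (f_small + f_mid + f_large):.2f}")
```

The integrals come out near 0.13, 0.03, and 0.01 planets per star. The first and last are close to the measured values of 0.130 and 0.013, the middle one is somewhat above the measured 0.023, and the 2-4 R⊕ share of roughly three quarters matches the statement in the text.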

The occurrence distributions in the top panel of Figure 6 have shapes that are more complicated than simple power laws. Occurrence falls off rapidly at short periods. We fit each of these distributions to a power law with an exponential cutoff,

$$ \frac{df(P)}{d\log P} = k_P P^{\beta} \left(1 - e^{-(P/P_0)^{\gamma}}\right). \qquad (8) $$

This function behaves like a power law with exponent β and normalization k_P for P ≫ P0. For periods P (in days) near and below the cutoff period P0, f(P) falls off exponentially. The sharpness of this transition is governed by γ. Thus the parameters of equation (8) measure the slope of the power law planet occurrence distribution for "longer" orbital periods as well as the transition period and sharpness of that transition.
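The behavior of equation (8) can be explored with a short sketch. The parameter values are those listed for Rp = 2-4 R⊕ in Table 5; the function names, the lower integration limit of 0.1 days, and the midpoint-rule integration are choices made for this example only.

```python
import math


def cutoff_power_law(period, k_p, beta, p0, gamma):
    """Evaluate k_p * P**beta * (1 - exp(-(P/p0)**gamma)) at `period` (days)."""
    return k_p * period ** beta * (1.0 - math.exp(-((period / p0) ** gamma)))


def integrate_to(p_max, params, p_min=0.1, steps=20000):
    """Midpoint-rule integral of the model over d(log10 P) from p_min to p_max."""
    lo, hi = math.log10(p_min), math.log10(p_max)
    h = (hi - lo) / steps
    return h * sum(
        cutoff_power_law(10.0 ** (lo + (i + 0.5) * h), *params) for i in range(steps)
    )


small_planets = (0.064, 0.27, 7.0, 2.6)  # k_P, beta, P0 (days), gamma for 2-4 R_Earth

# Far above the cutoff the model approaches the pure power law k_P * P**beta.
ratio_at_50 = cutoff_power_law(50.0, *small_planets) / (0.064 * 50.0 ** 0.27)
total_to_50 = integrate_to(50.0, small_planets)
print(f"model / power law at P = 50 d: {ratio_at_50:.6f}")
print(f"integrated occurrence to 50 d: {total_to_50:.3f}")
```

Integrating the central-value model out to 50 days gives roughly 0.13 planets per star, close to the directly measured occurrence for this radius range, although the large parameter uncertainties in Table 5 mean this agreement should not be over-interpreted.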

We fit equation (8) to the four ranges of radii shown in Figure 6 (top panel) and list the best-fit parameters in Table 5. We note that β > 0 for all planet radii considered, i.e. planet occurrence increases with log P. For the largest planets (Rp = 8-32 R⊕), β = 0.37 ± 0.35 is consistent with the power law occurrence distribution derived by Cumming et al. (2008) for gas giant planets with periods of 2-2000 days, df ∝ M^(-0.31±0.2) P^(0.26±0.1) d log M d log P.

P0 and γ can be interpreted as tracers of the migration and stopping mechanisms that deposited planets at the closest orbital distances. With decreasing planet radius, P0 increases and γ decreases, shifting the cutoff period outward and making the transition less sharp. Thus, gas giant planets (Rp = 8-32 R⊕) on average migrate closer to their host stars (P0 is small) and the stopping mechanism is abrupt (γ is large). On the other hand, the smallest planets in our study have a distribution of orbital distances (and periods) with a characteristic stopping distance farther out and a less abrupt fall-off close-in.

The normalization constant k_P is highly correlated with the other parameters of equation (8). A more robust normalization is provided by the requirement that the integrated occurrence to P = 50 days is given in Table 4.

Fig. 7. — Measured planet occurrence (filled circles) as a function of orbital period with best-fit models (solid curves) overlaid. These models are power laws with exponential cutoffs below a characteristic period, P0 (see text and equation 8). P0 increases with decreasing planet radius, suggesting that the migration and parking mechanism that deposits planets close-in depends on planet radius. Colors correspond to the same ranges of radii as in Figure 6. The occurrence measurements (filled circles) are the same as in Figure 6; however, for clarity the 2-32 R⊕ measurements and fit are excluded here. As before, only stars in the solar subset (Table 1) and planets with Rp > 2 R⊕ were used to compute occurrence.

4. STELLAR EFFECTIVE TEMPERATURE 
4.1. Planet Occurrence 

In the previous section we considered only GK stars with properties consistent with those listed in Table 1. In particular, only stars with Teff = 4100-6100 K were used to compute planet occurrence. Here we expand this range to 3600-7100 K and measure occurrence as a function of Teff. This expanded set includes stars as cool as M0 and as hot as F2. For Teff outside of this range there are too few stars to compute occurrence with reasonable errors. We use the same cuts on brightness (Kp < 15) and gravity (log g = 4.0-4.9) as before. We also used the photometric noise σCDPP values (as before) to compute the fraction of target stars around which each detected planet could have been detected with SNR > 10. This ensures that planet detectability down to sizes of 2 R⊕ will be close to 100% for all of these included target stars, independent of their Teff.

We computed planet occurrence using the same techniques as in the previous section, namely equation (2). We subdivided the stars and their associated planets into 500 K bins of Teff. We further subdivided the sample by planet radius, considering different ranges of Rp (2-4, 4-8, 8-32, and 2-32 R⊕) separately. In summary, we computed planet occurrence as a function of Teff for several ranges of Rp and in all cases we considered all planets with P < 50 days.

[Figure 8: histograms of planet occurrence versus Stellar Effective Temperature, Teff (K), from 3600 to 7100 K in 500 K bins. Stars per bin, coolest to hottest: 1081, 1761, 6009, 18865, 31369, 11795, 2299. Spectral types marked along the top axis: M0, K5, K2, K0, G8, G5, G2, G0, F8, F5, F2.]

Fig. 8. — Planet occurrence as a function of stellar effective temperature Teff. Histogram colors refer to planets with the same ranges of radii as in Figure 6. Here we consider planets with P < 50 days and expand beyond the solar subset to Teff = 3600-7100 K using the same cuts in log g (4.0-4.9) and Kp (< 15) to select bright main sequence stars. We include only target stars for which photometric noise permits the detection of 2 R⊕ planets, correcting for reduced detectability of small planets around the larger, hotter stars. The occurrence of small planets (Rp = 2-4 R⊕, orange histogram) rises substantially with decreasing Teff. The best-fit linear occurrence model for these small planets is shown as a red line. The number of stars in each temperature bin is listed at the top of the figure. MK spectral types (Cox 2000) for main sequence stars are shown for reference.

Figure 8 shows these occurrence measurements as a function of Teff. Most strikingly, occurrence is inversely correlated with Teff for small planets with Rp = 2-4 R⊕. Fitting the occurrence of these small planets in the Teff bins shown in Figure 8, we find that a model linear in Teff,

$$ f(T_{\rm eff}) = f_0 + k_T \left(\frac{T_{\rm eff} - 5100\,{\rm K}}{1000\,{\rm K}}\right), \qquad (9) $$

fits the data well. Using linear least-squares, the best-fit coefficients are f0 = 0.165 ± 0.011 and k_T = -0.081 ± 0.011 and the relation is valid over Teff = 3600-7100 K. We adopted a linear model because it is simple and provides a satisfactory fit with a reduced χ² of 1.03. However, we caution that the occurrence measurements in the three coolest bins have relatively large errors and are consistent with a flat occurrence rate, independent of Teff.
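The linear relation of equation (9) is simple to evaluate. The sketch below uses the quoted central values f0 = 0.165 and k_T = -0.081 and ignores their uncertainties; the function name `occurrence_vs_teff` is hypothetical.

```python
F0, K_T = 0.165, -0.081  # quoted best-fit coefficients for Rp = 2-4 Earth radii


def occurrence_vs_teff(teff_k, f0=F0, k_t=K_T):
    """Occurrence of 2-4 Earth-radius planets (P < 50 d) from the linear model."""
    if not 3600.0 <= teff_k <= 7100.0:
        raise ValueError("relation quoted as valid only for Teff = 3600-7100 K")
    return f0 + k_t * (teff_k - 5100.0) / 1000.0


for teff in (4100.0, 5100.0, 6100.0):
    print(f"Teff = {teff:.0f} K: f = {occurrence_vs_teff(teff):.3f}")
```

Across the 4100-6100 K range of the solar subset the model occurrence falls by about a factor of three, from roughly 0.25 to roughly 0.08 planets per star.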

The occurrence of planets with radii larger than 4 R⊕ does not appear to correlate with Teff (Figure 8), although detecting such a dependence would be challenging given the lower occurrence of these planets and the associated small number statistics in our restricted sample.

4.2. Sources of Error and Bias 

The correlation between the occurrence of 2-4 R⊕ planets and Teff is striking. In this subsection we consider three possible sources of error and/or bias that could have spuriously produced this result. First, we rule out random errors in the occurrence measurements or in the stellar parameters in the KIC. Next, we consider a systematic bias in R*, but conclude that any such bias will be too small to cause the correlation. Finally, we consider a systematic metallicity bias as a function of Teff. While we consider this unlikely, we cannot rule it out as the cause of the observed correlation.

4.2.1. Random Errors 



One might worry that the fit to equation (9) is driven
by fluctuations due to small number statistics in the
coolest temperature bins. The monotonic trend of rising
planet occurrence from 7100 K to 4600 K is less clear
for the two coolest bins with Teff = 3600-4600 K. The
coolest Teff bin, 3600-4100 K, contains only six detected
planets and carries the largest uncertainty of any bin.
The 4100-4600 K bin contains 13 detected planets. As a
test we excluded the hottest and coolest Teff bins and fit
equation (9) to the remaining occurrence measurements
(4100-6600 K). The best-fit parameters were unchanged
to within 1σ errors.

Next, we checked to see if random or systematic errors
in stellar parameters could cause the correlation of 2-4
R⊕ planet occurrence with Teff. The key stellar parameters
from the KIC are Teff and log g, which have RMS
errors of 135 K and 0.25 dex, respectively. Stellar radii
carry fractional errors of 35% RMS stemming from the
log g uncertainties.

Using a Monte Carlo simulation, we assessed the impact
of these random errors in the KIC parameters on
the noted correlation. In 100 numerical realizations, we
added Gaussian random deviates to the measured Teff
and log g values for every star in the KIC. These random
deviates, Δlog g and ΔTeff, had RMS values equal to the
RMS errors of their associated variables (135 K and 0.25
dex). Using the new log g values we updated R* for every
star using R*,new = R*,old × 10^(-Δlog g/2). Planet radii, Rp,
were updated in proportion to the change in R* for their
host stars. With each simulated KIC we performed the
entire analysis of this section: we selected KIC stars that
meet the Teff, log g, and Kp criteria, divided those stars
into 500 K subgroups, computed the occurrence of Rp =
2-4 R⊕ planets in each Teff bin using the perturbed Rp
values, and fit a linear function to the occurrence
measurements in each Teff bin, yielding f0 and kT. The
standard deviations of the distributions of f0 and kT from
the Monte Carlo runs are 0.011 and 0.009, respectively.
These uncertainties are nearly equal to the statistical
uncertainties of f0 and kT quoted above, which are derived
from the binomial uncertainty of the number of detected
planets within each Teff bin. Thus, our quoted errors on
f0 and kT above probably underestimate the true errors
by ~√2. We conclude that the correlation between Teff
and the occurrence of 2-4 R⊕ planets is not an artifact
of random errors in KIC parameters.
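The perturbation step of this Monte Carlo test can be sketched as follows. This is an illustrative sketch only, not the code used for the analysis: the function names and the example star are hypothetical, and the sample selection, binning, and refitting steps are omitted. The radius update assumes fixed stellar mass, so that g ∝ 1/R*² and R*,new = R*,old × 10^(-Δlog g/2).

```python
import random

SIGMA_TEFF = 135.0  # K, RMS error on KIC Teff quoted in the text
SIGMA_LOGG = 0.25   # dex, RMS error on KIC log g quoted in the text


def perturb_star(teff, logg, rstar, rng):
    """Return one perturbed (Teff, log g, R*) realization for a star."""
    d_teff = rng.gauss(0.0, SIGMA_TEFF)
    d_logg = rng.gauss(0.0, SIGMA_LOGG)
    # At fixed mass g ~ M / R^2, so R*,new = R*,old * 10^(-dlogg / 2).
    rstar_new = rstar * 10.0 ** (-d_logg / 2.0)
    return teff + d_teff, logg + d_logg, rstar_new


def perturb_planet_radius(rp, rstar_old, rstar_new):
    """Planet radius is rescaled in proportion to the host-star radius."""
    return rp * rstar_new / rstar_old


rng = random.Random(42)
# Hypothetical Sun-like star (not taken from the KIC).
teff, logg, rstar = 5500.0, 4.4, 1.0
ratios = []
for _ in range(10000):
    _, _, r_new = perturb_star(teff, logg, rstar, rng)
    ratios.append(r_new / rstar)
mean = sum(ratios) / len(ratios)
rms = (sum((x - mean) ** 2 for x in ratios) / len(ratios)) ** 0.5
print(f"fractional RMS scatter in R*: {rms / mean:.2f}")
```

With a 0.25 dex scatter in log g, the resulting fractional scatter in R* is roughly 30%, comparable to the 35% RMS radius error quoted above.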

4.2.2. Systematic R* Bias?

Potential systematic errors in the KIC parameters
present a greater challenge than random errors. We
assessed the impact of systematic errors by considering the
null hypothesis (that the occurrence of 2-4 R⊕ planets
is actually independent of Teff) and determined how
large the systematic error in R*(Teff) would have to be
to produce the observed correlation of occurrence with
Teff (equation 9). That is, systematic errors have to
account for the factor of 7 increase in the occurrence of 2-4
R⊕ planets between the Teff = 6600-7100 K and 3100-3600 K
bins. In this imagined scenario, the photometric
determination of log g in the KIC has a systematic error
that is a function of Teff. This systematic error causes
corresponding errors in R* and ultimately Rp that
depend on Teff. We assumed that the power law radius
distribution measured in Section 3.1 is independent of
Teff and that it remains valid for Rp < 2 R⊕. Then
the systematic error in Rp would shift the bounds of
planet radius in each Teff bin. That is, in the lowest
Teff bin (3100-3600 K), while we intended to measure
occurrence for planets with radii 2-4 R⊕, we actually
measured occurrence over a range of smaller radii, (2-4
R⊕)/S, where the occurrence rate is intrinsically higher.
Here, S is a dimensionless scaling factor that describes
the size of the systematic Rp error in the Teff = 3100-3600 K
bin. Similarly, for the Teff = 6600-7100 K bin we
intend to measure the occurrence of 2-4 R⊕ planets, but
instead we measure the occurrence of planets with Rp
= S × (2-4 R⊕) because of systematic errors in Rp(Teff) ∝
R*(Teff). Using the power law dependence for occurrence
with Rp, we find that S = (1/7)^(α/2) = 6.2
for the systematic error in Rp(Teff) to cause a factor of
7 occurrence error between the coolest and hottest Teff
bins. A factor of 6.2 error in R* corresponds to a log g
error of 1.6 dex and is akin to mistaking a subgiant for a
dwarf. Surely systematic errors in R* and log g from the
KIC are smaller than this. The KIC was constructed almost
entirely for the purpose of selecting targets for the
planet search by excluding evolved stars. Brown et al.
(2011) compared the log g values from the KIC and LTE
spectral synthesis of Keck-HIRES spectra and found that
only one star out of 34 tested had a log g discrepancy of
greater than 0.3 dex (see their Figure 8). We reject the
null hypothesis and conclude that the strong correlation
between the occurrence of 2-4 R⊕ planets and Teff is real.
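The conversion between a stellar radius error and a log g error used in this argument is simple arithmetic. The sketch below assumes fixed stellar mass (g ∝ 1/R*²), so that a radius error factor S corresponds to |Δlog g| = 2 log10(S); the function names are introduced here for illustration only.

```python
import math


def logg_error_for_radius_factor(s):
    """log g offset (dex) implied by a stellar-radius error factor s at fixed mass."""
    return 2.0 * math.log10(s)


def radius_factor_for_logg_error(dlogg):
    """Stellar-radius error factor implied by a log g offset (dex) at fixed mass."""
    return 10.0 ** (abs(dlogg) / 2.0)


print(round(logg_error_for_radius_factor(6.2), 2))  # 1.58, i.e. about 1.6 dex
print(round(radius_factor_for_logg_error(0.3), 2))  # 1.41
```

For comparison, the 0.3 dex discrepancy level discussed above corresponds to a radius error of only about 40%, far short of a factor of 6.2.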

4.2.3. Systematic Metallicity Bias? 

Another potential bias stems from the metallicity gradient
as a function of height above the galactic plane
(Bensby et al. 2007; Neves et al. 2009). The Kepler field
sits just above the galactic plane, with a galactic latitude 
range b = 6-20 degrees. The most luminous and hottest 
stars observed by the magnitude-limited Kepler survey 
are on average the most distant. Because of the slant 
observing geometry, these stars also have the greatest 
height above the galactic plane. Likewise, the least lu- 
minous and coolest stars observed by Kepler are closer to 
Earth and only a small distance above the plane. Given 
that the average metallicity declines with distance from 
the galactic plane, one might expect that the hottest 
stars have lower metallicity, on average, than the coolest 
stars observed by Kepler. 

This hypothesis suggests a key test: does the occurrence
of 2-4 R⊕ planets depend on [Fe/H]? Unfortunately
we are not able to perform this test using stellar
parameters from the KIC. While Teff values are accurate
to 135 K (rms), [Fe/H] values are of poor quality.
Brown et al. (2011) found [Fe/H] errors of 0.2 dex (rms),
and possibly higher due to systematic effects. Thus, the
[Fe/H] values from the KIC are not helpful in testing
the hypothesis that the occurrence of 2-4 R⊕ planets
depends on metallicity.

To get a sense of the size of the metallicity gradient as a
function of Teff, we simulated our magnitude-limited
observations of the Kepler field using the Besançon model
of the galaxy (Robin et al. 2003). This simulation
produced a synthetic set of stars (with individual values of
Teff, log g, [Fe/H], M*, etc.) based on the coordinates of
the Kepler field. We computed the median [Fe/H] for
the seven Teff bins in Figure 8 and found, from coolest to
hottest, [Fe/H] (median) = -0.02, -0.03, -0.03, -0.06,
-0.07, +0.01, +0.04. The somewhat surprising upturn
in metallicity in the two hottest Teff bins appears to be
due to an age dependence with Teff; younger stars are
more metal rich. The two hottest bins have a median
age of 2 Gyr, while the five cooler Teff bins have median
ages of 4-5 Gyr. We conclude based on this synthetic
galactic model that [Fe/H] varies by perhaps ~0.1 dex
over our Teff range and that the dependence need not be
monotonic due to age effects.

It is also worth considering how large an [Fe/H] gradient
is needed to increase giant planet occurrence by
a factor of seven. Clearly, occurrence trends for jovian
planets and 2-4 R⊕ planets need not be similar, but these
larger planets offer a sense of scale that may be relevant
for smaller planets. For giant planets, Fischer & Valenti
(2005) found that occurrence scales as ∝ 10^(2.0[Fe/H]),
while Johnson et al. (2010) found ∝ 10^(1.2[Fe/H]) after
accounting for the occurrence dependence on M*. These
scaling relations suggest that [Fe/H] gradients of 0.4-0.7
dex are needed to effect a factor of seven change in
occurrence. A metallicity change of only ~0.1 dex among 2-4
R⊕ planet hosts seems unlikely to change planet occurrence
by the amount we observed. Further, if the occurrence
of such planets depends so sensitively on [Fe/H],
it seems likely that Doppler surveys of them would have
detected this trend among the ~30 RV-detected planets
with Mp sin i < M_Neptune.
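The 0.4-0.7 dex range follows directly from inverting the two scaling relations. The sketch below is a worked check under the assumption that occurrence scales as 10^(b[Fe/H]) with b = 2.0 or 1.2; the function names are introduced here for illustration.

```python
import math


def feh_shift_for_ratio(ratio, exponent):
    """[Fe/H] change (dex) needed to change occurrence by `ratio`
    if occurrence scales as 10**(exponent * [Fe/H])."""
    return math.log10(ratio) / exponent


def ratio_for_feh_shift(dfeh, exponent):
    """Occurrence ratio produced by an [Fe/H] change of `dfeh` dex."""
    return 10.0 ** (exponent * dfeh)


for label, b in (("Fischer & Valenti (2005)", 2.0), ("Johnson et al. (2010)", 1.2)):
    need = feh_shift_for_ratio(7.0, b)
    gain = ratio_for_feh_shift(0.1, b)
    print(f"{label}: {need:.2f} dex for a factor of 7; 0.1 dex gives x{gain:.2f}")
```

Under these giant-planet scalings, a 0.1 dex metallicity change would alter occurrence by a factor of only about 1.3-1.6.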

The possibility that increased metallicity correlates
with increased 2-4 R⊕ planet occurrence contradicts
tentative trends of low-mass planets observed by Doppler
surveys. Valenti (2010) noted that among the host stars
of Doppler-detected planets, those stars with only planets
less massive than Neptune are metal poor relative to
the Sun. This tentative threshold is intriguing, but it 
only shows that the distribution of detected planets has 
an apparent [Fe/H] threshold, not that the occurrence 
of these planets depends systematically on [Fe/H]. To 
interpret the threshold physically, one needs to check for 
metallicity bias in the population of Doppler target stars. 

5. PLANET DENSITY 

It is tempting to extract constraints on the densities of
small planets by comparing the distribution of radii
measured by Kepler to the distribution of minimum masses
(Mp sin i) measured for Doppler-detected planets from
surveys of the Solar neighborhood (Cumming et al. 2008;
Howard et al. 2010). This effort may be partially
compromised by the different populations of target stars,
despite our efforts to select stars with similar log g and
Teff distributions. The Kepler target stars are typically
~50-200 pc above the Galactic plane while Doppler target
stars reside typically within 50 pc of the Sun near the
plane. Indeed, we saw earlier that the hot Jupiter
occurrence was 2.5 times lower in the Kepler survey than
in the Doppler surveys, suggesting a difference in stellar
populations, possibly related to the decline in metallicity
with Galactic latitude and/or differing Teff distributions.
Nonetheless, one should not ignore the opportunity to
search for information from combining the Kepler and
Doppler planet occurrences, with caveats prominently in
mind.

We first consider known individual planets that
have measured masses, radii, and implied bulk densities.
Placing these well-measured planets on theoretical
mass-radius relationships (e.g., Valencia et al. 2006;
Seager et al. 2007; Sotin et al. 2007; Baraffe et al. 2008;
Grasset et al. 2009) provides insight into the range of
compositions encompassed by the detected planets. Our
goal is to complement these few well-studied cases with
statistical constraints on the planet density distribution.



5.1. Known Planets 

We begin by considering the known planets with Rp <
8 R⊕ and Mp > 0.1 M⊕. This range of parameters selects
planets smaller than Saturn and as large or larger
than Mars. Figure 9 shows all such planets with good
mass and radius measurements from our solar system
and other systems. Theoretical models of Kepler-10b
(Batalha et al. 2011) based on its mass and radius
(4.5 M⊕ and 1.4 R⊕) suggest a rock/iron composition
with little or no water. Corot-7b has a radius of 1.7
R⊕ (Léger et al. 2009). Queloz et al. (2009) measured
a mass of 4.8 M⊕ for this planet, implying a density
of 5.6 g cm^-3 and a rocky composition. However, the
mass and density have remained controversial.
Independent mass determinations based on the same
spot-contaminated Doppler data yield masses that vary by
a factor of 2-3 (Pont et al. 2011; Hatzes et al. 2010;
Ferraz-Mello et al. 2010). We adopt the mass estimate
of 1-4 M⊕ from Pont et al. (2011), which implies a wide
range of possible compositions and also marginally favors
a water/ice-dominated planet. GJ 1214b is a less
dense super-Earth orbiting an M dwarf. The planet has
been modeled as a solid core surrounded by H/He/H2O
and may be intermediate in composition between ice giants
like Uranus and Neptune and a 50% water planet
(Nettelmann et al. 2010). The discovery of the six
co-planar planets orbiting Kepler-11 added five planets
with measured masses (from transit-timing variations)
to Figure 9 (Lissauer et al. 2011a). The remaining
exoplanets in Figure 9 all have masses greater than
Neptune's (17 M⊕) and densities less than 2 g cm^-3:
Kepler-4b (Borucki et al. 2010b), Gl 436b (Maness et al.
2007; Gillon et al. 2007; Torres et al. 2008), HAT-P-11b
(Bakos et al. 2010a), HAT-P-26b (Hartman et al. 2011),
Corot-8b (Bordé et al. 2010), and HD 149026b (Sato et al.
2005; Torres et al. 2008).

Figure 9 shows that the radii of known planets
increase with planet mass faster than do any of the
theoretical curves representing solid compositions of iron,
rock, or ice. This rapid increase in radius with mass
suggests that planets of higher mass contain larger fractional
amounts of H/He gas. The slope increases markedly for
masses above 4.5 M⊕, indicating that above that planet
mass the contribution of gas is common, even for these
close-in planets. Apparently planets above 4.5 M⊕ are
rarely solid. We suspect that for planets orbiting beyond
0.1 AU, where collisional stripping of the outer envelope
is less energetic and less common, the occurrence of gaseous
components will be greater.

Fortney et al. (2007b) modeled solid exoplanets composed
of pure water ("ice"), rock (Mg2SiO4), iron, and
binary admixtures. Their models include no gas component
and are shown as gray lines in Figure 9. Adding
gas to any of the models increases Rp and decreases ρ
(Adams et al. 2008). Thus planets below and to the right
of the ice contour (Figure 9, lower panel) have low densities
due to a gas component. Planets above the ice
contour contain increasing fractions of rock and iron,
depending on the specific system. Compositional details
matter greatly for specific systems, but for our simple
purpose we make the crude approximation that planets
with Rp < 3 R⊕ that have ρ > 4 g cm^-3 are composed
substantially of refractory materials (usually rock in the
form of silicates and iron/nickel). These planets may
have some water and gas, but those components do not
dominate the planet's composition as they do for Uranus,
Neptune, and larger planets.

Fig. 9. — ... mass and the lower panel shows density as a function of radius.
Solar system planets (Mars, Venus, Earth, Uranus, and Neptune)
are depicted as blue triangles. Extrasolar planets (filled circles)
are colored red for Kepler discoveries and black for discoveries by
other programs. Solid gray lines indicate the densities of solid
planets composed of pure ice, pure rock, and pure iron using the
Fortney et al. (2007a,b) models. Dotted gray lines show the densities
of admixture compositions (from bottom to top in lower panel):
67/33% ice/rock, 33/67% ice/rock, 67/33% rock/iron, 33/67%
rock/iron.

5.2. Mapping Kepler Radii to Masses 

The Eta-Earth Survey measured planet occurrence as
a function of Mp sin i in a volume-limited sample of
166 G and K dwarfs using Doppler measurements from
Keck-HIRES. The stars have a nearly unbiased metallicity
distribution and are chromospherically quiet to enable
high Doppler precision. In all, 35 planets were
detected around 24 of the 166 stars, including super-Earths
and Neptune-mass planets (Howard et al. 2009,
2011a,b). Correcting for inhomogeneous sensitivity at
the lowest planet masses, Howard et al. (2010) measured
increasing planet occurrence with decreasing mass over
five planet mass domains, Mp sin i = 3-10, 10-30, 30-100,
100-300, and 300-1000 M⊕, spanning super-Earths to
Jupiter-mass planets. This study was restricted to planets
with P < 50 days.

We mapped the planet radius distribution from Kepler
(Figure 5, including planets down to 1 R⊕) onto
mass (Mp sin i) using toy density functions, ρ(Rp). These
single-valued functions map all planets of a particular
radius, Rp, onto a planet mass Mp = 4πρ(Rp)Rp³/3. Of
course, real planets exhibit far more diversity in radii for
a given mass owing to different admixtures of primarily
iron/nickel, rock, water, and gas. Nevertheless, the models
allow us to check if average masses associated with
Kepler radii are consistent with Doppler measurements.

As part of this numerical experiment we converted Mp
to Mp sin i for each simulated planet using random orbital
orientations (inclinations i drawn randomly from
a probability distribution function proportional to sin i).
Our simulated Mp sin i distributions account for the transit
probabilities of planets detected by Kepler and the
detection incompleteness for planets with small radii. That
is, the simulated Mp sin i distributions reflect the true
distribution of planet radii (Section 3.1).
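The radius-to-mass mapping and the random-inclination step can be sketched as below. This is a minimal illustration, not the code used for the analysis: the function names and physical constants are supplied here, and the sampling uses the standard result that a probability density proportional to sin i on [0, π/2] is equivalent to cos i being uniform on [0, 1].

```python
import math
import random

R_EARTH_CM = 6.371e8  # mean Earth radius in cm (assumed constant)
M_EARTH_G = 5.972e27  # Earth mass in g (assumed constant)


def mass_from_radius(rp_rearth, density_cgs):
    """Planet mass in Earth masses from Mp = 4*pi*rho*Rp^3/3."""
    r_cm = rp_rearth * R_EARTH_CM
    return 4.0 * math.pi * density_cgs * r_cm ** 3 / 3.0 / M_EARTH_G


def draw_sin_i(rng):
    """Draw sin i for a random orientation, p(i) proportional to sin i."""
    cos_i = rng.random()  # cos i is uniform on [0, 1)
    return math.sqrt(1.0 - cos_i * cos_i)


rng = random.Random(1)
mp = mass_from_radius(2.0, 5.5)  # a 2 R_earth planet at Earth-like density
msini_samples = [mp * draw_sin_i(rng) for _ in range(5)]
print(f"Mp = {mp:.2f} M_earth; sample Mp sin i values:",
      [round(m, 2) for m in msini_samples])
```

Averaged over many orientations, sin i has a mean of π/4, so Mp sin i is on average about 79% of the true mass.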

Figure 10 shows simulated Mp sin i distributions assuming
several toy density functions. These distributions
are binned in the same Mp sin i intervals as in the
Howard et al. (2010) study. In the left column ρ(Rp) =
ρ0, where ρ0 is a constant. From bottom to top, we
considered four densities, ρ0 = 0.4, 1.35, 1.63, and 5.5
g cm^-3 (the bulk densities of HAT-P-26b, Jupiter, Neptune,
and Earth). We are most interested in the densities
of small planets so we make comparisons in the two lowest
mass bins for which Eta-Earth Survey measurements
are available, Mp sin i = 3-10 and 10-30 M⊕. In these
bins, the predicted occurrence from Kepler is too small
by 1.5-2σ compared with the Eta-Earth Survey measurements
for the three lowest constant density models, ρ0 =
0.4, 1.35, and 1.63 g cm^-3. Kepler predicts fewer small
planets than the Eta-Earth Survey measured. The simulated
Mp sin i distribution matches the observed Mp sin i
distribution well for an assumed density, ρ = 5.5 g cm^-3.
While this model is clearly unphysical when extended
over the entire radius range, consistency in the two
low-mass bins suggests that the small planets have higher
densities.

We explored slightly more complicated density functions
in the right column of Figure 10. These functions
are piecewise constant density models, with density rising
to 4.0, 5.5, and 8.8 g cm^-3 for small radii, as depicted
in the sub-panels of Figure 10. (Kepler-10b has a density
of 8.8 g cm^-3; Batalha et al. 2011.) We find the greatest
consistency between the synthetic and measured mass
distributions for two density models. One (model h) is
shown in the upper right panel of Figure 10, which has ρ
= 8.8, 5.5, 1.64, 1.33 g cm^-3 for Rp = 1-1.4, 1.4-3.0, 3.0-6.0,
and >6.0 R⊕, respectively. This model has a high
density (8.8 g cm^-3) for the smallest planets but successively
smaller densities for larger planets, approximately
consistent with the densities of known planets in Figure
9. The other successful model (g) has a density of 4
g cm^-3 for the smallest planets, with declining densities
for larger planets, qualitatively similar to the previous
model (h). This model (g) also yields a predicted
distribution of Mp sin i that agrees well with the observed
distribution of Mp sin i. Thus it too is viable. Both
successful models, g and h, are characterized by a high density
for the smallest planets of 1-3 R⊕. We tried a variety
of piecewise constant density functions and found that all
models that achieved consistency (< 1σ difference in the
3-10 and 10-30 M⊕ bins) have ρ > 4 g cm^-3 for Rp < 3 R⊕.
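Model h can be written down as a small lookup, which makes the implied masses easy to inspect. This is a sketch under stated assumptions: the treatment of the bin edges (each interval closed at its lower bound) is a choice made here, the function names are hypothetical, and masses are scaled to Earth using its mean density of 5.51 g cm^-3.

```python
EARTH_DENSITY = 5.51  # g cm^-3, mean density of Earth (assumed constant)

# Model h: (upper radius bound in R_earth, density in g cm^-3).
MODEL_H = [
    (1.4, 8.8),
    (3.0, 5.5),
    (6.0, 1.64),
    (float("inf"), 1.33),
]


def density_model_h(rp_rearth):
    """Piecewise constant density (g cm^-3) for Rp >= 1 R_earth."""
    if rp_rearth < 1.0:
        raise ValueError("model h is defined only for Rp >= 1 R_earth")
    for upper, rho in MODEL_H:
        if rp_rearth < upper:
            return rho
    raise AssertionError("unreachable: last bin is unbounded")


def mass_model_h(rp_rearth):
    """Planet mass in Earth masses: Mp/M_earth = (rho/rho_earth) * Rp^3."""
    return density_model_h(rp_rearth) / EARTH_DENSITY * rp_rearth ** 3


for rp in (1.2, 2.0, 4.0, 10.0):
    print(f"Rp = {rp:4.1f} R_earth -> rho = {density_model_h(rp):.2f} g cm^-3, "
          f"Mp = {mass_model_h(rp):.1f} M_earth")
```

Because the density drops at each radius boundary, the mapped mass is not monotonic in radius across a boundary, which is one reason these single-valued functions are only toy models.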

5.3. Conclusions 

The mapping of radius to mass offers circumstantial
evidence that a substantial population of small planets
detected by Kepler have high densities. Rocky
composition for the smallest planets supports the
core-accretion model of planet formation (Pollack et al. 1996;
Lissauer et al. 2009; Movshovitz et al. 2010). But we
caution again that the stellar populations of the Kepler
and Doppler surveys may be quite different. Planet
multiplicity also makes this an especially challenging
comparison. We computed the simulated Kepler mass
distributions (black histograms in Figure 10) based on
occurrence measured as the average number of planets
per star, while the Doppler results from the Eta-Earth
Survey (red points in Figure 10) computed occurrence
as the fraction of stars hosting at least one planet in
the specified Mp sin i interval. This difference is based
on intrinsic limitations of each approach. To infer the
fraction of stars with at least one planet from a transit
survey requires an assumption about the mutual inclinations
(Lissauer et al. 2011b). For Doppler surveys, it is
significantly easier to determine if a particular star has at least
one planet down to some specified mass limit, but it is
much more difficult to be sure that all planets orbiting
a star have been detected down to that same mass limit
(Howard et al. 2010). Finally, we note that no planets
at the extreme of our proposed high density regime (Rp
~ 3 R⊕ and ρ ~ 4 g cm^-3) have been detected (Figure
10). To date all detected planets with Rp > 2 R⊕ have
ρ < 2 g cm^-3. We conclude that while this technique
offers qualitative support for rising density with decreasing
planet size, in practice extracting firm quantitative
conclusions is difficult because of the intrinsic differences
between Doppler and transit searches.

6. DISCUSSION 
6.1. Methods 
Fig. 10. — Predicted mass distributions (Mp sin i, black histograms) based on planet radii measured by Kepler and hypothetical planet
density functions (blue lines in inset panels). The Fortney et al. (2007a,b) theoretical density curves for solid planets from Figure 9 are
plotted as solid and dotted gray lines in each inset panel. Planet occurrence measurements as a function of Mp sin i from the Eta-Earth
Survey (Howard et al. 2010) are shown as red filled circles. Panels in the left column show the mass distributions resulting from toy
constant-density models. From bottom to top (panels a-d), all planets have densities of 0.4, 1.35, 1.63, and 5.5 g cm^-3, independent of
radius, in analogy with the densities of HAT-P-26b, Jupiter, Neptune, and Earth. In the right column (panels e-h) density increases with
decreasing planet radius, as depicted by the inset density functions. Density functions that increase above ~4.0 g cm^-3 for planets with
Rp < 3 R⊕ yield greater consistency between the Eta-Earth Survey and Kepler.
We have attempted to measure pristine properties of
planets that can be compared with, and can inform,
theories of the formation, dynamical evolution, and interior
structures of planets. We have built upon the unprecedented
compendium of over 1200 planet candidates found
by the historic Kepler mission (Borucki et al. 2011). One
goal here was to measure planet occurrence (the number
of planets per star having particular orbital periods
and planet radii) by minimizing the deleterious effects of
detection efficiencies that are a function of planet
properties, notably radius and orbital period.

Our treatment of the vast numbers of target stars and
transiting planet candidates involved careful accounting
of two important effects. First, only planets whose orbital
planes are nearly aligned to Kepler's line of sight
will transit their host star, leaving many planets undetected.
We applied the standard geometrical correction
for the small probability, R*/a, that
the orbital plane is sufficiently aligned to cause a transit.
In counting planets, we assumed that for each detected
planet candidate there are actually a/R* planets,
on average, at all inclinations. Second, only planets
whose transits produce photometric signals exceeding
some signal-to-noise threshold will be reliably detected.
For each possible planet radius and orbital period, we
carefully identified the subset of the Kepler target stars
a priori around which such planets could be detected
with high probability. We adopted a threshold SNR of
10 for the transit signal in a single 90 day quarter of data,
thereby limiting both the target stars and the planet
detections with this SNR threshold. To be included, a target
star must have a radius and photometric noise that
allowed a planet detection with SNR > 10, i.e. a transit
depth 10 times greater than the uncertainty in the
mean depth from noise. Such restricted target stars offer
a high probability that planets will be detected.

We further selected Kepler target stars having a specific
range of Teff, log g, and brightness to ensure a well
defined sample of stars. We consider only bright target
stars (Kp < 15). We ignore all other Kepler target
stars and their associated planets. Remarkably, this a
priori selection of Kepler target stars immediately yields
a sample of only ~58,000 stars (and fewer when accounting
for requisite photometric noise), not the full 156,000
stars. For most of the paper, we restricted the sample
to main sequence G and K stars (log g = 4.0-4.9,
Teff = 4100-6100 K) to permit comparison with similar
Sun-like stars in the Eta-Earth Survey. This selection
of Kepler target stars for a given planet radius and orbital
period crucially leaves only a subset of stars in the
"sub-survey" for those planet properties. Importantly,
for planets with small radii (near 2 R⊕) and long periods
(near 50 days), only some 36,000-49,000 stars are
amenable to detection of such difficult-to-detect planets,
as shown in the annotations in the lower left corners
of the cells in Figure 4. By counting planets and dividing
by the number of appropriate stars that could have permitted
their secure detection we computed the planets
per star for a specific planet radius and orbital period
(within a specified delta in each quantity).
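The counting scheme described in this subsection can be sketched in a few lines. This is an illustrative sketch, not the code used for the analysis: each detected candidate in a radius-period cell is weighted by a/R* (the inverse of the geometric transit probability) and divided by the number of target stars around which it could have been detected at SNR > 10. The function name and all numerical values below are hypothetical.

```python
def cell_occurrence(candidates):
    """Planets per star in one radius-period cell.

    `candidates` is a list of (a_over_rstar, n_stars) pairs, one per detected
    planet candidate: a_over_rstar is the geometric correction a/R*, and
    n_stars is the number of target stars amenable to detecting that planet.
    """
    return sum(a_over_rstar / n_stars for a_over_rstar, n_stars in candidates)


# Three made-up candidates in a single cell (values are illustrative only).
example_cell = [(20.0, 45000), (35.0, 40000), (50.0, 36000)]
print(f"occurrence = {cell_occurrence(example_cell):.5f} planets per star")
```

Keeping a separate star count for each candidate is what lets small, long-period planets be normalized by the smaller set of quiet stars around which they are detectable.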

6.2. Comparison with Borucki et al. (2011)

It is worth describing the differences between this
paper and Borucki et al. (2011) resulting from differing
goals and methods. The primary purpose of
Borucki et al. (2011) was to summarize the results of the
Kepler observations and to act as a guide to the tables of
data. In doing so they tripled the number of known planets
(even when allowing for a false positive rate of ~5%;
Morton & Johnson 2011). Borucki et al. (2011) considered
the number distributions of all planets detected by
Kepler, independent of the properties of their host stars
(Teff, log g, Kp, σ_CDPP). They also computed the "intrinsic
frequencies" of planetary candidates, a close cousin of
our planet occurrence measurements, and plotted these
frequencies as a function of Teff.

The results in this paper are derived directly from
the planets announced in Borucki et al. (2011) and from
stellar parameters in the KIC (Brown et al. 2011). We
measure the occurrence distributions of planets orbiting
bright, main sequence G and K stars, which represent
only a third of the stars observed by Kepler and considered
in Borucki et al. (2011). Our desire for high detection
completeness compelled us to consider only robustly
detected planets satisfying Rp > 2 R⊕, P < 50
days, SNR > 10 in 90 days of photometry, and stars with
Kp < 15. This selection of stars and planets facilitated
comparison with the Eta-Earth Survey (Howard et al.
2010), which focused on the Doppler detection of planets
orbiting G and K dwarfs with P < 50 days. In this paper
we measured the detailed patterns of planet occurrence
as a function of Rp, P, and Teff, only for that subset of
stars, and interpreted these distributions in the context
of planet formation, evolution, and composition.

Borucki et al. (2011) chose to compute intrinsic frequencies
in small domains of semi-major axis and planet
radius while we work in a space of orbital periods and
planet radii. There are trade-offs with these choices. We
chose to work in period space because Kepler directly
measures orbital periods and translating to semi-major
axes requires either assumed stellar masses or radii. On
the other hand, by working in small domains of semi-major
axis, Borucki et al. (2011) compensate for this by
considering the range of orbital periods and transit durations
that contribute to each domain for the range of
masses and radii among the target stars. In this paper
we applied a binary detection criterion of SNR >
10 for 90 days of photometry (approximately one quarter).
Borucki et al. (2011) adopted a detection criterion
of SNR > 7 for the 136 days of Q0-Q2, with corrections
for the probability of low SNR detections (e.g. 7σ detections
are only recognized 50% of the time).



6.3. Patterns of Planet Occurrence 

Figure 4 shows graphically some of the key features of
close-in planet occurrence. The number of planets per
star varies by three orders of magnitude in the radius-period
plane (Figure 4) that spans periods less than 50
days and planet radii less than 32 R⊕. Planet occurrence
increases toward smaller radii (see Figure 5) down
to our completeness limit of 2 R⊕, with a power law
dependence given by df(R)/dlog10 R = kR R^α, where
df(R)/dlog10 R is the number of planets per star, for
planets with P < 50 days in a log10 radius interval centered
on R (in R⊕), kR = 2.9 (+0.5, -0.4), and α = -1.92 ± 0.11.
This is a remarkable result, showing that from planets
larger than Jupiter to those only twice the radius
of Earth, planet occurrence rises rapidly by nearly two
orders of magnitude. This rise with smaller size is consistent
with, and supports the measured rise of, the
planet occurrence with decreasing planet mass found by
Howard et al. (2010). The increased occurrence of small
planets seen in both studies supports the core-accretion
theory for planet formation (Pollack et al. 1996).

Planet occurrence also increases with orbital period
(Figure 6) in equal intervals of log P as f(P) =
kP P^β (1 - e^(-(P/P0)^γ)), with coefficients that all depend
on planet radius, and both β and γ being positive. This
functional form traces the steep rise in planet occurrence
near a cut-off period, P0. Below P0 planets are rare, but
for longer periods the planet occurrence distribution rises
modestly with a power law dependence. We find that P0
and γ (which governs how steep the occurrence fall-off is
below P0) depend on planet radius. The smaller planets,
Rp = 2-4 R⊕, have P0 ~ 7 days, while larger planets
have P0 ~ 2 days. Further, γ is larger for planets with
Rp > 4 R⊕, making the fall-off in planet occurrence more
abrupt below P0. The trends suggest that the mechanisms
that caused the planets to migrate and stop at
close orbital distances depend on planet size. Alternatively,
if a substantial number of small close-in planets
formed by in situ accretion, then our measurements trace
the contours of this process (Raymond et al. 2008).
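The behavior of this functional form on either side of the cut-off period can be seen by evaluating it directly. The sketch below is illustrative only: the function name is introduced here, and apart from P0 = 7 days (quoted above for 2-4 R⊕ planets) the parameter values are placeholders chosen for the example, not the fitted coefficients.

```python
import math


def period_occurrence(p_days, k_p, beta, p0_days, gamma):
    """f(P) = k_P * P^beta * (1 - exp(-(P / P0)^gamma))."""
    cutoff = 1.0 - math.exp(-((p_days / p0_days) ** gamma))
    return k_p * p_days ** beta * cutoff


# Placeholder coefficients (hypothetical), with P0 = 7 days as for 2-4 R_earth.
K_P, BETA, P0, GAMMA = 0.06, 0.3, 7.0, 2.5

for period in (1.0, 3.0, 7.0, 20.0, 50.0):
    full = period_occurrence(period, K_P, BETA, P0, GAMMA)
    power_law = K_P * period ** BETA
    print(f"P = {period:5.1f} d: f = {full:.4f} "
          f"({100.0 * full / power_law:.1f}% of the pure power law)")
```

Well below P0 the exponential factor suppresses occurrence by orders of magnitude, while well above P0 the expression reduces to the power law kP P^β; a larger γ makes the transition sharper.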

This period dependence of planet occurrence seems to
contradict the results from Doppler surveys of exoplanets
for which we find a pile-up of planets at periods of 3
days and a nearly flat distribution of planets for longer
periods, out to periods of 1 yr (Wright et al. 2009). The
key difference is that Kepler is sensitive to much smaller
planets (in radius and mass) than were Doppler surveys,
especially beyond 0.1 AU. To be sure, Kepler suffers a
geometrical decline in detectability as R*/a but we have
corrected for this trivially. Such a correction is more
difficult for Doppler surveys that have less uniform
detectability from star to star.

Another difference in the period distributions between
Kepler and Doppler surveys is in the pile-up of hot
Jupiters at orbital periods near 3 days (see, e.g., Figure
6). The Kepler-detected planets show a pile-up, but it
is modest, almost not significant, while for single planets
in Doppler surveys the pile-up is a factor of three above
the background occurrence at other periods (Marcy et al.
2008; Wright et al. 2009). This different planet occurrence
for hot Jupiters appears to be real, and may be due
to fewer metal-rich stars in the Kepler sample that are
located 50-200 pc above the Galactic plane, or different
stellar mass distributions in the magnitude-limited and
volume-limited surveys. The Kepler field has a greater
admixture of thick disk stars (that are metal poor with
[Fe/H] ≈ -0.5) to thin disk stars than do the Doppler
target stars. Other photometric surveys have noted that
hot Jupiter occurrence appears to vary with stellar population.
Gilliland et al. (2000) found no planets in a Hubble
Space Telescope survey of the globular cluster 47 Tucanae
and estimated a hot Jupiter occurrence that is an
order of magnitude lower than in the solar neighborhood.
Similarly, Weldrake et al. (2008) found no planets in the
ω Centauri globular cluster and found the occurrence of
hot Jupiters (P = 1-5 days) to be less than 0.0017 planets
per star. Gould et al. (2006) found an occurrence of



0.003_Q Q02 hot Jupiters per star for P = 3-5 days, based 
on the magnitude-limited OGLE-HI survey in the bulge 
of the Galaxy, and are compatible with our results from 
Kepler, 0.005 ± 0.001 planets per star for Rp= 8-32 P®, 
P < 50 days, and Kp < 16. 

We further find that planets larger than 16 $R_\oplus$ (1.5
$R_{\rm Jup}$) are extremely rare. Such inflated planets are
also rare among transiting planets detected from the
ground (see, e.g., the mass-radius diagram for gas giant
planets in Bakos et al. (2010b)). For several-Gyr-old
planets, theoretical mass-radius curves show a maximum
near $R_p \approx 13 R_\oplus \approx 1.2 R_{\rm Jup}$
(Fortney et al. 2007b). Larger planets are typically young
or close-in and inflated by one of several proposed
mechanisms (e.g., Batygin & Stevenson 2010; Laughlin et al.
2011; Burrows et al. 2007).

We also note some interesting morphology in the
two-dimensional occurrence domain of planet radius and
orbital period (Figure 3). There is a ridge of higher planet
occurrence for super-Earths and Neptunes, similar to
that identified in Howard et al. (2010). The ridge appears
to be diagonal when plotting either $M_p$ or $R_p$ vs.
$P$, extending from a period and radius of 3 days and 2
$R_\oplus$ (lower left) to a period and radius of 50 days and
4 $R_\oplus$. This ridge can be seen by direct inspection of
Figure [H, both by the density of the dots and by
the colors. The upper envelope of red boxes (indicating
high planet occurrence) extends along a diagonal from
lower left to upper right. This ridge conveys some key
information about the formation and perhaps dynamical
evolution or migration of the 2-4 $R_\oplus$ planets.

The paucity of close-in Neptune-mass planets ($M_p \sin i$
= 10-100 $M_\oplus$, $P < 20$ days) seen in Howard et al.
(2010) is not as clearly visible in the Kepler data. In
particular, the "top" of this desert ($M_p \sin i$ = 100
$M_\oplus$, or the radius equivalent) is not as clear. A
further study of Kepler stars to fainter magnitudes of
Kp = 16 may shed light on this desert. The overall planet
occurrence for GK stars and periods less than 50 days,
listed in Table 3, shows that the occurrence of planets of
2-4 $R_\oplus$ is 0.130 ± 0.008 planets per star. This agrees
well with the occurrence of 3-30 $M_\oplus$ planets found by
Howard et al. (2010) of 15^4%. The planet occurrence for all
planet radii from 2-32 $R_\oplus$ is only 16.5%, again in
agreement with Howard et al. (2010) and Cumming et al.
(2008). We find little support for the suggestion of planet
occurrences of super-Earths and Neptunes ($M_p \sin i$ =
3-30 $M_\oplus$) of 30% ± 10% (Mayor et al. 2009) for
$P < 50$ days.

We also measured planet occurrence as a function of
$T_{\rm eff}$ of the host star, a proxy for stellar mass. For
the smallest planets, 2-4 $R_\oplus$, the results show a
nearly linear rise in planet occurrence with smaller stellar
mass. One may wonder if this rise might be caused by some
systematic error due to poor values of $T_{\rm eff}$ or
$R_\star$ in the Kepler Input Catalog. Such a systematic
error seems nearly impossible, as the KIC values of
$T_{\rm eff}$ are accurate to 135 K (RMS) and in any case the
KIC values certainly vary monotonically with the true value
of $T_{\rm eff}$, even if one imagines some large systematic
error in them. Thus the increase in planet occurrence with
smaller $T_{\rm eff}$, and hence smaller stellar mass,
appears to be real. Again, we emphasize that the SNR = 10
criterion for a Kepler target star to be included in our
survey implies that the detection efficiency is close to
unity for all stars, from 7100 K to 3600 K, for
$R_p > 2 R_\oplus$. Examination of Figure 8 shows that even
if one ignores the coolest and hottest stars, the increase
of planet occurrence persists robustly. Thus it appears
that the number of planets per star increases by a factor
of seven from stars of 1.5 $M_\odot$ to stars of 0.4
$M_\odot$ ($T_{\rm eff}$ = 7100-3600 K), with all of that
$T_{\rm eff}$ dependence coming from the smallest planets,
2-4 $R_\oplus$. This high occurrence of close-in small
planets around low-mass stars represents significant
information about the formation mechanisms of planets of
2-4 $R_\oplus$.

We considered the possibility that this correlation is
due to a systematic metallicity bias that depends on
$T_{\rm eff}$. That is, cool stars are relatively nearby,
close to the galactic plane, and have higher metallicities,
while hot stars are on average more distant, at greater
heights above the galactic plane, and have lower
metallicities. In this scenario, low metallicity is the
driving force behind lower planet occurrence at higher
$T_{\rm eff}$. Using the Besançon galactic model, we
estimate that metallicities may vary by ~0.1 dex as a
function of $T_{\rm eff}$, but the dependence need not be
monotonic because the median age varies with
$T_{\rm eff}$. It would be remarkable if such a modest
difference in metallicity could cause a factor of seven
difference in close-in planet occurrence. Unfortunately, due
to the poor [Fe/H] measurements in the KIC, we are unable
to measure the occurrence of planets as a function
of [Fe/H]. Note, however, that either result has profound
implications for planet formation: the occurrence of 2-4
$R_\oplus$ planets depends strongly on stellar properties,
$T_{\rm eff}$ or [Fe/H].

Sub-Neptune-size and jovian planets appear to have
opposite trends in occurrence as a function of $M_\star$. We
showed that the occurrence of 2-4 $R_\oplus$ planets
decreases by a factor of seven with $M_\star$ over ~0.4-1.5
$M_\odot$ ($T_{\rm eff}$ = 3600-7100 K). Johnson et al.
(2010) measured the occurrence of giant planets as a
function of $M_\star$ and [Fe/H] and found a positive
correlation with both quantities. That is, the occurrence of
giant planets increases with increasing $M_\star$ over the
range ~0.3-1.9 $M_\odot$. Their study considered only giant
planets that produce $K > 20$ m s$^{-1}$ Doppler signals and
orbit within 2.5 AU. Subgiants with $M_\star$ = 1.4-1.9
$M_\odot$ have the highest rate of giant planet occurrence
in their study. However, most of these planets orbit at
~1-2 AU, with almost no planets inside of $P$ = 50 day
orbits (Bowler et al. 2010). Close-in planets of all sizes
larger than 2 $R_\oplus$ appear to be rare around the most
massive stars accessible to transit and Doppler surveys.

6.4. Planet Formation 

Population synthesis models of planet formation by
core accretion simulate the growth and migration of
planet embryos embedded in a proto-planetary disk of
gas and dust. Among their key predictions is the
distribution of planet mass or radius as a function of
orbital distance. Early versions of these models (Ida & Lin
2004; Alibert et al. 2005; Mordasini et al. 2009a) were
tuned to match the distribution of giant planets detected
by RV (Cumming et al. 2008; Udry & Santos 2007) by
decreasing the rate of Type I migration compared to
theoretical predictions (Ward 1997; Tanaka et al. 2002). The
simulations predicted that planet occurrence rises with
decreasing planet mass. But most of the low-mass planets
resided in orbits near or beyond the ice line at ~2-3 AU.
These models also robustly predicted a "planet desert", a
region of parameter space nearly devoid of planets. Planets
with $M_p \sin i \approx$ 1-20 $M_\oplus$ and $a < 1$ AU
were predicted to be extremely rare because producing such
planets requires the gas disk to dissipate while one of two
faster processes was happening: Type II migration or
runaway gas accretion. Meanwhile, the models predicted that
planets with masses above the desert, $M > 20 M_\oplus$, but
residing inside of ~1 AU would exhibit a nearly constant
distribution with mass.

Howard et al. (2010) demonstrated that the observed
distribution of close-in planets ($P < 50$ days) exhibited
quite different properties from those predicted by
population synthesis. The predicted planet desert is
actually populated by the highest planet occurrence of any
region of the mass-period parameter space yet probed
(the "ridge" noted above). The planet mass function
rises steeply with decreasing planet mass, in contradiction
to the expected nearly constant distribution with
mass outside of the desert. From Kepler, we also see
many planets populating the predicted desert (Figure
3) and a planet radius distribution that rises steeply
with decreasing planet size (tracking the mass
distribution). The latest versions of the population
synthesis models (Ida & Lin 2010; Alibert et al. 2011) offer
improvements including non-isothermal treatment of the
disk (Paardekooper et al. 2010) and multiple, interacting
planet embryos per simulation. But they still predict
a planet desert (albeit partially filled in). The
contours of planet occurrence in Figure 3 offer rich detail
to which future refinements of these models can be
tuned. Alternatively, the distribution of observed planets
may be strongly shaped by processes that take place after
the gas clears, namely planet-planet scattering (e.g.,
Ford et al. 2005; Ford & Rasio 2008; Chatterjee et al.
2008; Raymond et al. 2009), secular and resonant migration
(e.g., Lithwick & Wu 2010; Wu & Lithwick 2011),
and planetesimal migration and growth (e.g., Kirsh et al.
2009; Capobianco et al. 2011; Walsh & Morbidelli 2011).
If these processes strongly shape the final planet
distributions, then the planet distributions from population
synthesis models (which truncate when the gas clears)
will form the input to additional simulations that model
post-disk effects and aim to match the presently observed
planet distributions.

Current planet formation theory must also adapt to
account for remarkable orbital properties of exoplanets.
Not included here is an analysis of the orbital
eccentricities, which span the range $e$ = 0-0.93 (e.g.,
Marcy et al. 2005b; Udry & Santos 2007; Moorhead et al.
2011). In addition, the close-in "hot Jupiters" show a wide
distribution of inclinations relative to the equatorial
plane of the host star (Winn et al. 2010, 2011; Triaud et
al. 2010; Morton & Johnson 2010). Thus, standard planet
formation theory probably requires additional planet-planet
gravitational interactions to explain these non-circular
and non-coplanar orbits (Chatterjee et al. 2010;
Wu & Lithwick 2011).

6.5. The Future of Kepler 

We strongly advocate for an improved catalog of stellar
parameters for the ~1000 Kepler planet host stars and a
comparably sized control sample. Our occurrence
measurements and their interpretations would be strengthened
by an improved knowledge of $R_\star$, $\log g$, [Fe/H], and
$T_{\rm eff}$. The $R_\star$ values from the KIC are only
known to 35% (rms), which leads to a proportionally large
uncertainty in $R_p$. We saw that hot Jupiters have a
significantly lower occurrence in the Kepler sample than in
RV surveys. We were unable to test whether this is due to
differing metallicities of the host stars because [Fe/H] is
poorly measured in the KIC. Similarly, we are unable to
completely rule out a metallicity gradient with height above
the galactic plane as the underlying cause of the observed
seven-fold decrease in the occurrence of 2-4 $R_\oplus$
planets with increasing $T_{\rm eff}$.

Finally, we note that Figure 3 shows representative
planets having $R_p \sim 2.5 R_\oplus$ and $P < 50$ days, all
of which reach SNR ~ 20 in four quarters of Kepler
photometry (and SNR ~ 10 in one quarter). If we consider
the SNR for planets of radius 1 $R_\oplus$, the transit depth
is 6 times shallower, implying total SNR values near SNR =
20 / 6 = 3.3. Thus, planets of 1 $R_\oplus$, even in short
periods under 50 days, would not reach the threshold SNR for
meriting a secure detection with current data in hand.
For planets of 1 $R_\oplus$ to reach SNR ~ 6.6, Kepler must
acquire four times more data, i.e. five years total, still
constituting a marginal detection. Clearly an extended
mission of an additional ~3 yr is needed to bring planets
of 1 $R_\oplus$ to SNR > 7.
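
The scaling argument above can be reproduced with a short
Python sketch. It assumes that transit depth scales as $R_p^2$
and that SNR grows as the square root of the observing time
(uncorrelated noise); both are simplifying assumptions, and the
function names are our own. Using the exact depth ratio
$2.5^2 = 6.25$ rather than the rounded factor of 6 gives
slightly lower values than those quoted in the text.

```python
import math


def scaled_snr(ref_snr, ref_radius, radius, ref_years, years):
    """Scale a reference transit SNR to another radius and time baseline."""
    depth_ratio = (radius / ref_radius) ** 2    # transit depth ~ R_p^2
    time_gain = math.sqrt(years / ref_years)    # SNR ~ sqrt(observing time)
    return ref_snr * depth_ratio * time_gain


def years_to_reach(target_snr, ref_snr, ref_radius, radius, ref_years):
    """Observing time needed for a planet of `radius` to reach target_snr."""
    snr_at_ref = scaled_snr(ref_snr, ref_radius, radius, ref_years, ref_years)
    return ref_years * (target_snr / snr_at_ref) ** 2


if __name__ == "__main__":
    # Reference case from the text: SNR ~ 20 for 2.5 Earth radii
    # in four quarters (about one year) of photometry.
    print(scaled_snr(20.0, 2.5, 1.0, 1.0, 1.0))      # one year of data
    print(scaled_snr(20.0, 2.5, 1.0, 1.0, 4.0))      # four years of data
    print(years_to_reach(7.0, 20.0, 2.5, 1.0, 1.0))  # time to reach SNR = 7
```

Under these assumptions a 1 $R_\oplus$ planet needs between four
and five years of photometry to reach SNR = 7, consistent with
the case made here for an extended mission.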